The effect of hybridization on transposable element accumulation in an undomesticated fungal species

Abstract
eLife digest
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

Transposable elements (TEs) are mobile genetic elements that can profoundly impact the evolution of genomes and species. A long-standing hypothesis suggests that hybridization could deregulate TEs and trigger their accumulation, although it received mixed support from studies mostly in plants and animals. Here, we tested this hypothesis in fungi using incipient species of the undomesticated yeast Saccharomyces paradoxus. Population genomic data revealed no signature of higher transposition in natural hybrids. As we could not rule out the elimination of past transposition increase signatures by natural selection, we performed a laboratory evolution experiment on a panel of artificial hybrids to measure TE accumulation in the near absence of selection. Changes in TE copy numbers were not predicted by the level of evolutionary divergence between the parents of a hybrid genotype. Rather, they were highly dependent on the individual hybrid genotypes, showing that strong genotype-specific deterministic factors govern TE accumulation in yeast hybrids.

eLife digest

Hybrids arise when two populations of organisms that are related but genetically different mate and produce offspring. Hybridization has long been regarded as one of the many ways species evolve. Studying the changes in the genome that result from this process can provide insights into evolutionary history and predict the outcome of mixing between genetically different populations. In fact, the inability of two organisms to mate and produce viable and fertile hybrids has been used as a way to define species. It has been speculated that the infertility of many hybrids is due to short sequences of DNA in the genome called transposable elements. These elements are sequences of DNA that, when active, can move to a different position in the genome, causing mutations. It is thought that the process of hybridization may be activating transposable elements leading to the infertility often observed in hybrids.

The activation of transposable elements in hybrids has been studied in animals and plants, and usually, the hybrids studied were either generated in the laboratory or found in the wild. Fungal species, such as the yeast Saccharomyces paradoxus, have hundreds of wild strains, including many hybrids, and can also be crossed in the laboratory to produce new hybrids, allowing a combined approach to studying the activation of transposable elements. Hénault et al. used this yeast to investigate whether hybridization leads to increased activity of transposable elements in fungi.

To test this hypothesis, Hénault et al. analyzed the genomes of hundreds of natural strains of S. paradoxus to count and locate their transposable elements and establish evolutionary relationships between them. Next, they crossed different strains in the laboratory to see how the transposable elements would act upon hybridization.

If transposable elements were activated by hybridization, then hybrids would accumulate more transposable elements. However, the analyses did not show increased numbers of transposable elements in wild hybrids of S. paradoxus. This could be explained by an actual absence of transposable element activation, or by natural selection eliminating individuals that accumulate more transposable elements. To determine which is the case, Hénault et al. next recreated several hybrids in the laboratory and reproduced them for hundreds of generations. Hybrids were grown in the laboratory such that natural selection was almost incapable of favoring some yeasts over others, allowing the hybrids to accumulate transposable elements. These experiments revealed that hybrids accumulated transposable elements at different and largely unpredictable rates. Indeed, closely related hybrids often had highly different numbers of transposable elements in their genomes after being reproduced in the laboratory.

These observations indicate that the accumulation of transposable elements depends on various factors and cannot be easily predicted, and that hybridization may only be a small piece of the puzzle. Additionally, Hénault et al. demonstrated that undomesticated organisms like fungi can provide unique insights into evolutionary hypotheses.

Introduction

Hybridization is increasingly recognized as an important component of species evolution (Mallet, 2005; Mallet et al., 2016; Stukenbrock, 2016). Abundant empirical evidence shows that hybridization frequently leads to transgressive phenotypes in plants, animals, and fungi (Langdon et al., 2019; Rieseberg et al., 1999). Mutation rates are traits for which transgressive segregation can have a major impact on the evolutionary fate of hybrids, with potential implications for the emergence of reproductive barriers and speciation. Insertions caused by transposable elements (TEs) are a class of mutations that could be particularly responsive to hybridization. TEs are diverse and ubiquitous DNA sequences that can duplicate and spread within genomes, forming families of dispersed repeated sequences (Bourque et al., 2018). They can either replicate through ‘copy-and-paste’ (retrotransposons) or ‘cut-and-paste’ (DNA transposons) mechanisms. TEs were first identified in cytological studies of chromosome instability in Zea mays (McClintock, 1950). In her seminal 1984 perspective, McClintock, 1984 summarized evidence suggesting that TEs respond to various types of ‘genome shocks’, one of which she suspected was hybridization. Since McClintock’s early insights, many studies investigated the association between TEs and hybridization across taxonomic groups and documented increased TE activity in hybrids. We refer to this increase as TE reactivation, but our use of this term implies no mechanistic basis and other terms like derepression are used in the literature. Reactivation was evidenced by increased TE abundance, novel TE insertions, TE epigenetic changes or higher TE transcript levels (Dion-Côté et al., 2014; Kidwell et al., 1977; Labrador et al., 1999; Liu and Wendel, 2000; O'Neill et al., 1998; Picard, 1976; Ungerer et al., 2006). However, hybrid reactivation is not universal, as many studies across a similar taxonomic range found no evidence for it (Coyne, 1989; Göbel et al., 2018; Hey, 1988; Kawakami et al., 2011). In fungi, a recent preprint showed that no TE reactivation occurs in experimental interspecific hybrids between the yeasts Saccharomyces cerevisiae and Saccharomyces uvarum (Smukowski Heil et al., 2020). Thus, TE reactivation exhibits no clear taxonomic distribution, and the conditions favorable for its manifestation are still unclear.

Studies of hybrids in controlled crosses revealed genetic bases to TE reactivation, but less focus was put on cases of natural hybrids (but see Kawakami et al., 2011; Ungerer et al., 2009). In addition to the genetic structure of populations, environmental pressures and demographic factors can influence the action of evolutionary forces on TEs, thus impacting the genomic landscapes of TEs in natural populations, including hybrids. Population genetic models have investigated the action of the major evolutionary forces on mean TE copy number (CN) per individual (Charlesworth and Charlesworth, 1983; Charlesworth and Langley, 1989). Three such forces are deterministic (Charlesworth et al., 1994). First, TE CN increases at a certain transposition rate, which incorporates the various forms of transposition regulation (either by the host or by TEs themselves). Second, TEs are removed from the genome at a certain excision rate. Third, most TE insertions are deleterious, such that individual fitness is assumed to be a decreasing function of CN and natural selection acts to purge TE insertions from populations. The fourth important force is genetic drift, a stochastic force that impacts the efficiency of natural selection by increasing the probability of fixation of deleterious mutations (Lynch, 2007). The intensity of drift is inversely proportional to the effective population size N_e. In populations with small N_e, drift renders natural selection inefficient, such that deleterious mutations are more likely to fix by chance. In principle, any of those evolutionary forces could affect hybrids in a way that differs from their parents, leading to hybrid-specific dynamics in the accumulation of TEs.

Population data in Drosophila indicate that deterministic forces have a predominant influence on the allele frequency spectrum of TE insertions (Charlesworth and Langley, 1989), consistent with a minimal contribution from drift. Among the deterministic forces, transposition rate is arguably the most likely to be impacted by hybridization, for instance if genetic incompatibilities impair transposition regulation mechanisms. A classic illustration of this is hybrid dysgenesis in Drosophila, a syndrome caused by mismatches between paternal TEs and egg-loaded pools of Piwi-interacting RNAs (piRNAs) that lead to inefficient TE repression (Brennecke et al., 2008; Bucheton et al., 1984; Erwin et al., 2015; Rubin et al., 1982). Similarly, hybrids between Arabidopsis thaliana and Arabidopsis arenosa show overexpression of a TE family from the A. arenosa subgenome that is dependent on the ploidy level of the parental accessions involved, suggesting that dosage imbalance of epigenetic regulators causes reactivation (Josefsson et al., 2006). It can also be expected that the number of genetic incompatibilities increases with genetic divergence, yielding a positive relationship between evolutionary divergence and TE deregulation (Orr, 1995; Parhad and Theurkauf, 2019). Additionally, evidence from Drosophila suggests that hybridization may impact excision rates for specific TE families (Coyne, 1989). Along with these deterministic factors, reductions in natural selection efficiency should also be considered. Evidence from Arabidopsis (Lockton et al., 2008; Lockton and Gaut, 2010), Caenorhabditis (Dolgin et al., 2008) and Daphnia (Ye et al., 2017) suggests that long-term differences in N_e caused by demography or mating system are associated with variation in TE content (Lynch and Conery, 2003). A reduction in N_e can be expected when a novel hybrid lineage is formed. Indeed, it has been hypothesized that population bottlenecks inherent with hybrid speciation in Helianthus might have played a role in the accumulation of TEs through a decreased efficiency of purifying selection (Ungerer et al., 2009).

An emerging eukaryotic model system in ecological and evolutionary genomics is the yeast Saccharomyces paradoxus, the undomesticated sister species of the budding yeast S. cerevisiae. S. paradoxus has a complex admixture history in North America, making it an ideal model system to test hybridization-related evolutionary hypotheses in natural populations and in the laboratory (Figure 1; Hénault et al., 2017). North American S. paradoxus lineages comprise the Eurasian lineage SpA, the endemic incipient species SpB and SpC (hereafter termed pure lineages), and the admixed lineages SpC*, SpD₁ and SpD₂ (hereafter termed hybrid lineages; Figure 2a). SpC* is an incipient hybrid species that arose from a post-glacial secondary contact between the endemic lineages SpB and SpC and consists of mostly SpC-like genomes with ~5% introgressions from SpB (Leducq et al., 2016), while SpD₁ and SpD₂ arose more recently from hybridization between SpC* and SpB and have roughly equal genomic contributions from both parents (Eberlein et al., 2019; Figure 2b).

Figure 1

Download asset Open asset

Population genomic data and laboratory evolution experiments allow to uncover the effects of various factors on transposable element (TE) accumulation in hybrids.

Upon hybridization, the TE content of a hybrid is determined by equal contributions from the TE contents of its parents. Subsequent evolution can either leave the hybrid TE content unchanged, or drive it to increase or decrease. These changes can be driven by biases in transposition or excision rates or by variation in natural selection efficiency. Two complementary approaches can be used to investigate which factors drive TE dynamics in hybrids. First, genomic data from natural populations can be harnessed to understand how population structure, environmental pressures and natural selection efficiency shaped natural variation in TE content. Second, evolution experiments in laboratory controlled conditions (constant environment and relaxed natural selection) allow to test the effect of properties of artificial hybrid genotypes on TE accumulation, namely evolutionary divergence between parents and initial TE content.

Figure 2 with 3 supplements see all

Download asset Open asset

Ty elements showed no reactivation in hybrid lineages and contrasted abundances among pure lineages.

(a) Bayesian phylogenetic tree of the North American *Saccharomyces paradoxus* lineages based on genome-wide SNPs. (b) Schematic admixture graph showing the evolutionary relationships between pure and hybrid lineages as inferred by Eberlein et al., 2019; Leducq et al., 2016. Dotted lines represent admixture events. (c) Counts of full-length elements (top) and solo LTRs (bottom) in six whole-genome assemblies based on long reads. No full-length Ty4 or Ty5 were found in any assembly, thus these families were considered extinct. (d) Copy number (CN) variation in active Ty families (i.e. families which have full-length elements) measured as log₂ normalized read depth (NRD) over Ty reference internal sequences for 207 diploid wild strains. Sample size for each lineage (excluding strains for which there was no coverage) is indicated between parentheses. Whiskers span 1.5 times the interquartile range. The p-values of Kruskal-Wallis tests are shown. Stars on bars show the results of Conover post-hoc tests. *: p<0.05, **: p<0.01, ***: p<0.001.

Figure 2—source data 1 Counts of Ty annotations by type from the six whole-genome assemblies.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig2-data1-v2.csv.zip
Download elife-60474-fig2-data1-v2.csv.zip

Yeasts of the Saccharomyces genus are valuable model species to investigate the interplay between hybridization and transposition. Saccharomyces genomes contain many long terminal repeats (LTR) retrotransposons of the Ty1/copia (Ty1, Ty2, Ty4 and Ty5) and Ty3/gypsy (Ty3) families. Ty1/copia and Ty3/gypsy retrotransposon superfamilies (Wicker et al., 2007) respectively correspond to the Pseudoviridae and Metaviridae virus families (Krupovic et al., 2018) and have a broad taxonomic distribution. Ty elements comprise two overlapping open reading frames (ORFs) within a ~5 kb internal sequence, which is flanked by two ~300 bp LTRs in direct orientation. The two LTRs of full-length elements are prone to intra-element recombination, leaving behind solo LTRs in the genome (Parket et al., 1995). Ty1 in the budding yeast S. cerevisiae is among the best characterized eukaryotic TEs, benefiting from decades of research on its molecular and cellular biology and, more recently, on its population genomics (Bleykasten-Grosshans et al., 2013; Curcio et al., 2015). Although Saccharomyces genomes harbor a relatively modest fraction of Ty elements (~1–3% in size), their streamlined genomes (~12 Mbp) and short generation time (~2 hr) provide great power to perform TE comparative genomics and laboratory evolution experiments.

Here, we use population genomic data and laboratory evolution experiments on S. paradoxus and its sibling species S. cerevisiae to investigate the hybrid reactivation hypothesis and, more generally, the factors governing TE accumulation in natural and experimental lineages. Our results show that natural and experimental hybrids exhibit no reactivation of TEs. Rather, we show that deterministic factors like population structure and the properties of individual hybrid genotypes are major determinants of TE content evolution in hybrid genomes.

Results

S. paradoxus lineages exhibit contrasted TE evolutionary dynamics but no reactivation in hybrids

To investigate the TE content and dynamics in North American S. paradoxus pure and hybrid lineages, we annotated six highly contiguous whole-genome assemblies previously generated from Oxford Nanopore long reads (PRJNA514804) (Eberlein et al., 2019). We used a database comprising reference sequences for S. cerevisiae Ty1, Ty2, Ty3, Ty4, and Ty5 families, along with S. paradoxus Ty3-related Ty3p (Carr et al., 2012) and Ty4-related Tsu4, the latter originating from a horizontal transfer from the Saccharomyces eubayanus/Saccharomyces uvarum clade to S. paradoxus (Bergman, 2018). These annotations revealed counts of full-length elements an order of magnitude lower than solo LTRs (Figure 2c, Figure 2—figure supplement 1), consistent with previously reported data on S. cerevisiae and S. paradoxus (Carr et al., 2012; Goffeau et al., 1996; Kim et al., 1998; Yue et al., 2017). To overcome the difficulty of comparing multiple families of short and often degenerated LTR sequences within a phylogenetic framework, we built intra-genome LTR sequence similarity networks. Sequences within the Ty1-Ty2 and Ty3p-Ty3 LTR family pairs were indistinguishable (Figure 2—figure supplement 2), in agreement with Ty2 and Ty3 being specific to S. cerevisiae (Carr et al., 2012; Naumov et al., 1992). Consequently, Ty2 and Ty3 LTR annotations were merged with Ty1 and Ty3p, respectively. Although they were clearly distinct, Ty4 and Tsu4 LTRs showed a close relatedness, which justified their combination into the Ty4/Tsu4 family. No Tsu4 LTR was annotated in the SpA genome, consistent with this family being specific to North American S. paradoxus and absent from Eurasian S. paradoxus (Bergman, 2018).

Based on the assembly annotations, Ty1 was the most abundant family and exhibited the largest CN variation among lineages, with a striking difference between SpB and SpC (Figure 2c). The generality of CN patterns revealed by genome assemblies was confirmed by normalized short-read depth (NRD) over Ty reference sequences for 207 natural diploid isolates (Figure 2d, Figure 2—figure supplement 3, Supplementary file 1a; PRJNA277692, PRJNA324830, PRJNA479851). The hybrid lineages SpC*, SpD₁ and SpD₂ had intermediate full-length Ty1 and Tsu4 CNs compared to their parental lineages, indicating no apparent persistent reactivation or maintenance of Ty elements in these lineages. Notably, Ty3p appeared extinct in all hybrid lineages, with no signal for internal sequences. The most outstanding profile was that of the pure lineage SpC, which had significantly higher full-length Ty1 and Ty3p CNs compared to the other lineages. For Tsu4, SpB was the lineage with the highest CN. Thus, lineage-wide CNs were consistent with an absence of reactivation in hybrids, the most contrasted profiles being found among pure lineages.

CNs alone may not reflect the actual dynamics of a TE family within a genome. For instance, a recently expanding family may not have had the time to reach the CN of another family (or the same family in a different lineage) that would be more abundant but at equilibrium. We thus explored the evolutionary dynamics of Ty families within each genome. If high CNs were caused by recent family expansions, they should be associated with a signature of low divergence (Bergman and Bensasson, 2007). We used the sequence divergence to the most closely related LTR for each LTR annotation to approximate evolutionary divergence since transposition. The importance of low-divergence peaks (≤1%) for Ty1 was significantly higher in SpC and SpA compared to SpB (Fisher’s exact test, FDR corrected, Figure 3a; see Figure 3—figure supplement 1 for other divergence thresholds). SpA also had a significant underrepresentation of low-divergence LTRs for the composite Ty4/Tsu4 family due to the absence of Tsu4. The height of low-divergence peaks was consistent with CNs of full-length elements (Figure 2d). Hybrid lineages SpC*, SpD₁ and SpD₂ had Ty1 peaks similar to SpC, although less important. They also had no peak for Ty3p, consistent with the absence of full-length elements. Overall, this result indicated that CN variation reflected variation in family dynamics and confirmed that Ty elements exhibited no reactivation in hybrid lineages.

Figure 3 with 3 supplements see all

Download asset Open asset

Evolutionary dynamics of Ty families in *S. paradoxus*.

(a) Distributions of minimum nucleotide divergence between long terminal repeat (LTR) sequences. Heatmaps show FDR-corrected p-values for pairwise Fisher’s exact tests between ratios of elements ≤1% and >1% nucleotide divergence. Color maps are centered at the significance threshold of 0.05. Blue cells depict statistically significant tests. See Figure 3—figure supplement 1 for other divergence thresholds. (b) LTR divergence and conservation levels. Horizontal positions depict Ty orthogroups clustered from conserved (left) to private (right). Color map shows minimum nucleotide divergence. Scattered red dots show LTRs that belong to full-length elements. FDR-corrected p-values of Fisher’s exact tests for the ratio of full-length elements in private orthogroups are shown. ns: p≥0.05, *: p<0.05, **: p<0.01, ***: p<0.001. (c) Lineages through time curves for Ty1 ultrametric bayesian phylogenetic trees, using a conserved Ty4 LTR orthogroup as an outgroup. Branching time is scaled such that values of one correspond to the present. From left to right, distributions show all LTRs (with a close-up on recent dynamics), LTRs from private elements and LTRs from full-length elements.

Figure 3—source data 1 Nucleotide divergence data for LTR sequences from the six whole-genome assemblies.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig3-data1-v2.csv.zip
Download elife-60474-fig3-data1-v2.csv.zip
Figure 3—source data 2 Ty orthogroups defined from the six whole-genome assemblies.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig3-data2-v2.csv.zip
Download elife-60474-fig3-data2-v2.csv.zip
Figure 3—source data 3 Branching times for Ty1 LTR sequences from the six whole-genome assemblies.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig3-data3-v2.csv.zip
Download elife-60474-fig3-data3-v2.csv.zip

Next, we tested whether orthology between genome-wide Ty insertion loci was consistent with transposition activity. We defined Ty orthogroups from a whole-genome multiple alignment of the six assemblies (Figure 3b). Most orthogroups were either private to one genome or conserved across the six (Figure 3—figure supplement 2), suggesting that a major fraction of Ty insertion polymorphisms segregated at low frequency in North American lineages. This was consistent with studies in S. cerevisiae showing that insertions are mostly fixed or at low frequency (Bleykasten-Grosshans et al., 2013; Carr et al., 2012). Private orthogroups comprised LTRs that had low divergence, consistent with recent transposition activity (Figure 3—figure supplement 2). Private orthogroups were significantly enriched in full-length Ty1 elements (Figure 3b), including in hybrid lineages. This indicated that all lineages had active transposition for at least some Ty families. It also suggested that an absence of transposition in hybrid lineages could not explain the absence of reactivation. We called lineage-wide insertion polymorphisms from discordant mappings of the 207 short read libraries to estimate the distribution of insertion allele frequencies. This calling method is noisy and only allows to call non-reference alleles, but should nevertheless provide rough estimates of the insertion allele frequency spectra. Principal component analysis (PCA) on non-reference insertions yielded the expected grouping of strains into well-defined lineages (Figure 3—figure supplement 3). The allele frequency distributions showed a similar excess of very-low-frequency alleles for the six lineages (Figure 3—figure supplement 3), suggesting that Ty elements were subject to largely uniform evolutionary forces across lineages.

We focused on the most abundant Ty1 to investigate the transposition dynamics in a phylogenetic framework (Le Rouzic et al., 2013). Bayesian phylogenetic trees of Ty1 LTRs were converted into ultrametric phylogenies to approximate the timing of branching events using a conserved Ty4 LTR orthogroup as an outgroup. Branching times revealed that lineages had largely similar Ty1 dynamics (Figure 3c). Due to the shortness of LTR sequences, most branches collapsed with the outgroup, indicating an expected saturation of the molecular substitution signal. A focus on recent dynamics revealed a clear expansion signature in SpC that was shared by SpC* and, to a lesser extent, by SpD₁ and SpD₂ (Figure 3c), which was reflected by the peaks of young private and full-length elements. This suggests that any recent expansion signature in hybrid lineages can be largely explained by inheritance from their SpC-related parent. The rapid expansion in SpC and SpC* can be contrasted with the more gradual expansion in SpA, which harbored the second highest Ty1 CN. This contrasted with the assumption that TE abundances can be interpreted as equilibrium CNs under the transposition-selection balance hypothesis (Charlesworth and Charlesworth, 1983) and rather suggested a recent transposition rate increase in SpC-related lineages.

Population structure best explains TE content variation, with minor contributions from climatic variation

Our results showed that hybrid lineages exhibited no outstanding Ty CN profiles or sequence dynamics. However, potential hybrid-specific differences in transposition rate might be masked by other factors that can be expected to impact Ty dynamics. To exclude this possibility, we investigated which factors best explain the variation in Ty CN in natural lineages. First, population structure incorporates both host-encoded (Curcio et al., 2015) and Ty-encoded (Czaja et al., 2020; Saha et al., 2015) determinants of Ty transposition rates. Second, environmental factors can influence Ty transposition rate, as demonstrated by temperature (Paquin and Williamson, 1984) and nitrogen starvation (Morillon et al., 2000) for Ty1 in S. cerevisiae. If hybrid lineages experience distinct environmental conditions in nature, this may impact their transposition rates. Finally, the efficiency at which natural selection purges deleterious Ty insertions (which depends on N_e) could vary between lineages, ultimately affecting the transposition-selection balance. To help discriminate global from lineage-specific patterns arising from these factors, analyses with all lineages aggregated were complemented by lineage-specific analyses. However, hybrid lineages were represented by relatively few strains (≤10 in each case), which made it hard to reach meaningful conclusions for some analyses. Thus, some of the lineage-specific analyses were focused on the more abundant SpB and SpC pure lineages.

We first investigated the relative contribution of population structure and environmental factors. We combined genome-wide single nucleotide variants from our population genomic data with climatic (temperature, precipitation, solar radiation, vapor pressure, wind) and water balance data (Abatzoglou et al., 2018). We reduced the dimensionality of both datasets by PCA, using coordinates on the major principal components (PCs) to describe population structure (Figure 4—figure supplement 1) and climatic variation (Figure 4—figure supplement 2, Figure 4—figure supplement 3). We then fitted linear models to compare how genotypes-based PCs (gPCs) and environment-based PCs (ePCs) can explain CN variation in Ty1, Ty3p and Tsu4 (Figure 4a). Globally, gPCs reached higher levels of statistical significance as linear predictors of CN compared to ePCs. gPC1 both explained a high proportion of variance (52%, 9% and 36% for all lineages, SpB and SpC, respectively; Figure 4—figure supplement 1) and was a highly significant linear predictor of Ty1 CN. Linear models that explained Ty CNs with both gPCs and ePCs had generally lower Akaike information criterion (AIC) values than models that included only gPCs (Supplementary file 1b), suggesting that climatic variables have modest although significant contributions in explaining CN variation. Unsupervised clustering of strains along gPCs consistently yielded more homogeneous clusters than ePCs in terms of Ty CN (Figure 4b, Figure 4—figure supplement 4). Overall, these results show that the genetic structure of North American S. paradoxus is generally a much better predictor of Ty CNs across lineages than climatic variables. Since locations where hybrid lineages were sampled have climatic profiles similar to pure lineages (Figure 4—figure supplement 2), exposure to specific environmental conditions appears unlikely to explain the absence of observable Ty reactivation in hybrid lineages.

Figure 4 with 4 supplements see all

Download asset Open asset

Associations between population structure, climatic variables and Ty CNs for all lineages (left), *SpB* only (middle), or *SpC* only (right).

(a) Linear models fitted on Ty CNs with gPCs and ePCs as independent variables. For each PC, the p-value in the linear model is plotted against the proportion of variance explained in the respective PCA. (b) Unsupervised clustering performed on either gPC or ePC coordinates. Cumulative distributions show the standard deviation in Ty CN within each gPC-cluster or ePC-cluster. Distributions are also shown as boxplots, with whiskers spanning 1.5 times the interquartile range. FDR-corrected p-values of two-tailed Mann-Whitney U tests between gPC-cluster and ePC-cluster CN distributions are shown. ns: p≥0.05, *: p<0.05, **: p<0.01.

Figure 4—source data 1 Parameter values for predictors of Ty CNs (gPCs and ePCs) fitted in linear models.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig4-data1-v2.csv.zip
Download elife-60474-fig4-data1-v2.csv.zip
Figure 4—source data 2 gPC-cluster and ePC-cluster membership for wild strains of S. paradoxus.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig4-data2-v2.csv.zip
Download elife-60474-fig4-data2-v2.csv.zip

Variation in natural selection efficiency does not explain TE abundance variation in wild lineages

Most TE insertions are expected to have deleterious effects on the individual in which they occur (Wilke and Adams, 1992), although some can be advantageous (Aminetzach et al., 2005; Filteau et al., 2015; Gresham et al., 2008; Van't Hof et al., 2016). Natural selection acts to purge deleterious insertions, and its efficiency depends on the strength of genetic drift opposing it (which is inversely proportional to N_e). Ty reactivation in hybrid lineages might be counterbalanced by higher efficiency of natural selection in these lineages, although there is no a priori reason to expect higher N_e in hybrids. We thus sought to investigate if CN variation across lineages can be explained by variation in N_e. Genome-wide nucleotide diversity at fourfold degenerate codon positions (π_S) is related to long-term N_e and was used to approximate this parameter across diverse taxa (Lynch and Conery, 2003). Nucleotide diversity was also used at the intraspecific level in S. paradoxus to estimate N_e (Tsai et al., 2008). While pure lineages did exhibit variation in π_S (Supplementary file 1c), there was no obvious negative relationship between Ty CNs and π_S (Figure 5—figure supplement 1), as would be expected under a transposition-selection equilibrium (assuming negligible variation in mutation rate). For instance, nucleotide diversity was higher in SpC than in SpA, while SpC exhibited the highest Ty1 and Ty3p CN. While these rough N_e estimates should be interpreted with caution, they suggested that natural selection efficiency is likely not the main determinant in CN variation. However, our data indicated that some Ty families experienced recent expansions and thus may not be at equilibrium. Recent hybrid lineages also cannot be meaningfully characterized by long-term N_e, further limiting the usefulness of π_S as a measure of the efficiency of natural selection.

We can expect that most of the fitness effects associated with TE insertions are caused by alterations of host functions encoded in cis. Perhaps the TE insertions with the largest functional alteration potential are those which interrupt ORFs of host genes. Across the six genome assemblies, no Ty insertion was found to interrupt an ORF. Lineage-wide Ty insertion calls revealed four Ty insertions predicted to be within host ORFs, three of which are found in SpC strains and one in SpB. All the ORF-disrupting insertions were private alleles, and in all cases the strain harboring them had a CN higher than the median of its lineage (Figure 5—figure supplement 2). Examination of depth of coverage at these four loci revealed narrow peaks in the corresponding strains that were consistent with target site duplications (TSDs) arising from Ty integration (Figure 5—figure supplements 3–6). The absence of ORF-disrupting Ty insertions in hybrid lineages and their scarcity in pure lineages suggest that natural selection is globally very efficient at purging these mutations.

The number of ORF-disrupting Ty insertion alleles in SpC (three) compared to SpB (one) suggested a slight overrepresentation in SpC, which could be explained by a bias towards the accumulation of deleterious insertion alleles in this lineage. To test this hypothesis, we examined the functional annotations of the corresponding genes. There were striking patterns of functional relatedness among genes harboring an ORF-disrupting insertion. Two SpC strains had independent Ty1 insertions at different loci within YDL024C (DIA3, which encodes a protein of unknown function [Palecek et al., 2000]), as evidenced by the corresponding TSDs (Figure 5—figure supplement 3, Figure 5—figure supplement 4). The two other genes were involved in resistance to arsenic compounds: 16_037 (SpB) had a Tsu4 insertion in YPR200C (ARR2, which encodes an arsenate reductase [Mukhopadhyay and Rosen, 1998], Figure 5—figure supplement 5), while LL2011_004 (SpC) had a Ty1 insertion disrupting YPR201W (ARR3, which encodes an arsenite/H⁺ antiporter [Wysocki et al., 1997], Figure 5—figure supplement 6). This functional relatedness could be explained by positive selection; however, this scenario is incompatible with the extremely low frequency at which these insertions segregate. Thus, the most parsimonious explanations for these patterns appears to be either a mutational bias favoring Ty insertions at these loci or fitness effects too small to be visible to natural selection.

We investigated the fitness effect of the ORF-disrupting Ty1 insertion in ARR3 by measuring the growth of a collection of 123 S. paradoxus strains in culture medium supplemented with arsenite (Figure 5a, Figure 5—figure supplement 7, Figure 5—figure supplement 8). Growth in 0.8 mM sodium meta-arsenite (NaAsO₂) showed significant variation between lineages (mixed design ANOVA, p=9.54 × 10⁻³⁹), and more total variance was explained by variation between lineages (lineages: 86.58%, strains: 10.20%, residuals: 3.22%). SpC and SpC* lineages exhibited poor growth in 0.8 mM arsenite compared to SpB and SpA (Figure 5a). This phenotype was generalized to the whole lineages, and while LL2011_004 was in the lower part of the SpC distribution, it was not significantly different from many SpC strains not harboring an insertion (Figure 5b, Figure 5—figure supplement 9). This indicated that the disruption of a major gene involved in arsenite resistance had no further deleterious effect on the poor growth ability of SpC-related strains in presence of arsenite. The absence of growth defect in arsenite for LL2011_004, combined with the extremely low frequencies of all the ORF-disrupting alleles, suggests that these insertions are effectively neutral and are not observed because of an important reduction in purifying selection efficiency.

Figure 5 with 11 supplements see all

Download asset Open asset

Genomic patterns of Ty insertions indicate that natural selection efficiency cannot explain CN variation in natural lineages.

(a) Difference between growth curves in 0.8 mM NaAsO₂ (As) and control medium (synthetic complete, SC). Curves depict the difference in culture optical density (OD) through time, averaged across replicates and bins of four timepoints (one hour). 123 strains were assayed. *SpC* strain LL2011_004, which harbors a Ty1 insertion in *ARR3*, is represented by a dashed line. (b) Area under the curve (AUC) of the difference between growth curves in As and SC. Replicates of 107 strains that passed the criteria of number of replicates and homoscedasticity are shown. The number of strains per lineage is shown between parentheses. Replicates of the *SpC* strain LL2011_004 are highlighted with large empty circle symbols. Lowercase letters indicate groups that are not significantly different following Tukey’s HSD test. (c) Counts of essential and non-essential genes in the immediate vicinity of Ty insertions across the six whole-genome assemblies. Essential to non-essential ratios are shown on top of the bars. (**d-e**) Cumulative distributions of number of protein physical interactions (d) and ω (e) for genes neighboring conserved (dashed line) or private (solid lines) Ty insertions. Genes containing ORF-disrupting insertions are labeled with symbols. Heatmaps show FDR-corrected p-values for pairwise Mann-Whitney U tests between distributions. Color maps are centered at the significance threshold of 0.05. (f) Cumulative distributions of Δk between *SpB* and *SpC*. (g) Counts of neighboring genes with positive or negative Δk values.

Figure 5—source data 1 Growth data in SC and As for wild S. paradoxus strains.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig5-data1-v2.csv.zip
Download elife-60474-fig5-data1-v2.csv.zip
Figure 5—source data 2 Properties of genes neighboring Ty insertions in the six whole-genome assemblies.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig5-data2-v2.csv.zip
Download elife-60474-fig5-data2-v2.csv.zip

To gain a more comprehensive view of the strength of selection acting on Ty insertions, we compared the properties of genes in their neighborhood. Ty elements contain many transcription factor binding sites and can affect the expression levels of adjacent genes (Morillon et al., 2000; Coney and Roeder, 1988; Todeschini et al., 2005; Lesage and Todeschini, 2005). Dosage-sensitive and functionally central genes should on average be subject to stronger selection against mutations that affect their regulation. For instance, in S. cerevisiae, genes that are essential or that have many protein physical interactions were shown to be depleted in eQTLs, i.e. variants that affect expression levels (Albert et al., 2018). However, the association of Ty insertions with regulatory effects on adjacent genes is expected to be highly variable, as transcriptional effects are known to depend on the location, orientation and nucleotide sequence of the inserted element (Coney and Roeder, 1988; Lesage and Todeschini, 2005). Nevertheless, if selection efficiency explains a significant part of Ty CN variation, we reason that when CNs are high (i.e. at low N_e), Ty insertions should be more frequently observed near genes which properties reflect stronger dosage and functional constraints, compared to when CNs are low (i.e. at high N_e). Focusing on the Ty insertions private to each lineage, we found no significant difference in the proportion of neighboring genes annotated as essential in S. cerevisiae (p>0.05, pairwise Fisher’s exact tests, FDR correction), although SpC did exhibit the highest essential to non-essential ratio (Figure 5c). Distributions of physical distance to neighboring genes were similar across lineages, as were distributions of relative orientation (Figure 5—figure supplement 10). Additionally, genes neighboring Ty insertions private to SpC show no bias towards more protein physical interactions (Figure 5d, p>0.05, pairwise two-tailed Mann-Whitney U tests, FDR correction). We fitted the ratio of non-synonymous to synonymous substitutions (ω) to each neighboring gene to estimate the strength of purifying selection on protein-coding sequences. The ω distributions for genes neighboring private Ty insertions exhibited no significant bias (p>0.05, pairwise two-tailed Mann-Whitney U tests, FDR correction). In particular, the distribution in SpC exhibited no skew towards lower values, suggesting no association with genes under strong purifying selection (Figure 5e). Notably, genes harboring ORF-disrupting insertions show little protein physical interactions and no signature of purifying selection (Figure 5d–e).

We next investigated whether lineage-specific selection signatures on genes are associated with Ty insertions. We used the RELAX branch-site codon model (Wertheim et al., 2015) to derive estimates of selection intensity (k) along individual gene trees. Within the RELAX framework, k < 1 indicates relaxed selection and k > 1 indicates strengthened (positive or purifying) selection. We define Δk = k_SpB−k_SpC, corresponding to the difference between the average k values for clades comprising SpB and SpC strains, respectively. Overall, the distribution of Δk values for genes neighboring private SpC insertions was very similar to that for conserved insertions (Figure 5f), while the distribution for private SpB insertions was slightly biased toward genes relaxed in SpB. For both SpB and SpC, there were more private insertions adjacent to genes experiencing relaxed selection in the corresponding lineage, although comparisons with conserved insertions were not statistically significant (p>0.05, Fisher’s exact tests, Figure 5g). This is consistent with efficient purifying selection acting against insertions near genes in both lineages. While genes harboring ORF-disrupting insertions have signatures of stronger selection in SpC (Figure 5e, Figure 5—figure supplement 11), this observation is mitigated by their far-right position in the ω distribution (Figure 5d). Overall, these results show that variation in natural selection efficiency is unlikely to explain a large part of the variation in Ty CNs in natural lineages.

TE content variation in experimental evolution under relaxed selection reveal no systematic effect of hybridization but major effects of individual genotypes

Thus far, our results showed that no reactivation is apparent in natural hybrid lineages, that population structure over environmental variables best explains variation in Ty CNs and that natural selection efficiency is unlikely to explain much variation in Ty genomic landscapes. However, natural selection seems equally efficient at purging deleterious Ty insertions in all lineages, leaving the possibility that strong purifying selection could mask any reactivation in hybrids. Transient reactivation may also be followed by rapid re-establishment of host repression and leave little to no signature in contemporary genomes. Moreover, natural hybrids are somewhat limited in their ability to uncover the interplay between hybridization and transposition, as they are relatively recent introgressed lineages that resulted from admixture between closely related lineages. To systematically explore the effect of hybridization on Ty transposition rates, we performed a large-scale mutation accumulation (MA) experiment on 11 artificial hybrid backgrounds, eight of which were previously published (Charron et al., 2019). A MA experiment with yeast consists in subjecting a population to periodic single-cell bottlenecks (typically during hundreds to thousands of mitotic generations) by repeatedly streaking one randomly chosen isolated colony, thus amplifying genetic drift and making natural selection inefficient (Lynch et al., 2008; Figure 6a). S. paradoxus was inferred to reproduce mitotically 99.9% of the time in the wild (Tsai et al., 2008), making MA experiments a reasonably realistic approximation of its natural life cycle. Pairs of haploid derivatives of natural S. paradoxus and S. cerevisiae strains were chosen to span a range of evolutionary divergence (Figure 6b), yielding four categories of intraspecific crosses (from the least to the most divergent): SpC × SpC (CC), SpB × SpB (BB), SpB × SpC (BC) and SpB × SpA (BA); and one category of interspecific crosses: SpB × S. cerevisiae (BSc; Figure 6c, Supplementary file 1d-f). The crosses used also spanned a range of variation in parental Ty CNs (Figure 6d). For each cross, 48 to 96 MA lines derived from independent matings were propagated for ~770 mitotic divisions.

Figure 6 with 1 supplement see all

Download asset Open asset

Ty copy number (CN) variation in mutation accumulation (MA) lines reveal no effect of parental divergence and major genotype-specific effects.

(a) Principle of an MA experiment on a yeast population. N corresponds to population size. (b) Bayesian phylogenetic tree of the parental haploid strains used to generate artificial hybrids. (c) Design of the crosses for the MA experiment. (d) Ty CNs in the haploid parental strains of each cross in c. (e) CN at the onset (T_ini, empty boxes) and at the end (T_end, filled boxes) of the MA experiment. Whiskers span 1.5 times the interquartile range. FDR-corrected p-values of Wilcoxon signed-rank tests and Brown–Forsythe tests for the equality of variances are shown in black and blue, respectively. ns: p≥0.05, *: p<0.05, **: p<0.01, ***: p<0.001. (f) Linear models explaining the mean variation in CN (ΔCN = CN_Tend−CN_Tini) for a given cross as a function of the joint CN and genetic divergence of its parental strains. The p-value for each term in the models is shown. Scatter plots (CCPR plots) show the partial residuals for each independent variable, along with the fitted regression.

Figure 6—source data 1 Ty CNs for the haploid parental strains of the MA crosses.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig6-data1-v2.csv.zip
Download elife-60474-fig6-data1-v2.csv.zip
Figure 6—source data 2 Evolutionary divergence between haploid parental strains of the MA crosses.: https://cdn.elifesciences.org/articles/60474/elife-60474-fig6-data2-v2.csv.zip
Download elife-60474-fig6-data2-v2.csv.zip

We selected a random subset of 23 to 40 MA lines per cross and performed whole-genome resequencing on stocks frozen at T_ini and T_end, totaling 372 strains sequenced at each timepoint (PRJNA515073). All these strains were confirmed to be diploid at T_ini and T_end by quantifying total DNA content using flow cytometry (Charron et al., 2019; Marsit et al., 2020). Focusing on full-length Ty1, Ty3p, and Tsu4, no global trend emerged from the variation of NRD before and after MA (Figure 6e, Figure 6—figure supplement 1, Supplementary file 1f). Mixed effects linear models fitting cross-specific intercepts and slopes showed no statistically significant support for a global effect of time on CNs. Nevertheless, specific combinations of crosses and Ty families showed significant differences in either central tendency or variance. Often, crosses within a category of parental strain divergence showed contrasting patterns. For example, although CC1, CC2, and CC3 are genetically very similar SpC × SpC crosses, only CC1 exhibits a significant decrease in Ty1 and Tsu4 CNs. There were also cases of opposite trends within the same category of cross, for instance Ty1 CN in BC and BA crosses. This showed that individual parental genotypes, even if they are evolutionarily closely related, can have very different dynamics, both in sign and magnitude.

We next sought to understand how these dynamics are determined by properties of the different hybrid genotypes tested. We fitted linear models to the variation in CN between T_ini and T_end (ΔCN), using the sum of CNs of both parental strains of a cross and the evolutionary divergence between them as independent variables. Although these models explain little global variance (R² = 0.40, 0.08, and 0.06 for Ty1, Ty3p, and Tsu4, respectively), we found that neither independent variable was a significant predictor of ΔCN (Figure 6f). Only Tsu4 showed a slightly positive regression slope with respect to evolutionary divergence, and none of them was statistically significant, indicating that divergence between parents of a hybrid has no large influence on its CN dynamics. On the other hand, Ty1 ΔCN had a negative relationship with parental Ty1 CN and was marginally significant (p=0.056), suggesting that the initial CN of a hybrid genotype can dictate the sign and magnitude of its CN change. However, this result should be interpreted with care as some of the crosses shared common parental strains and were thus not entirely independent. This relatedness may in fact explain part of the pattern seen for Ty1. Crosses that shared the SpB parental strain UWOPS-91–202 (BB2, BC2, BA2, BSc2) seemed to have systematically higher Ty1 ΔCN values than the corresponding crosses that shared the SpB parental strain MSH-604 (BB1, BC1, BA1, BSc1). UWOPS-91–202 appeared to be a naturally Ty-less strain, as no full-length element was found for any family in its genome (Figure 6d). Previous reports of such Ty-less strains were shown to be permissive to the accumulation of Ty elements when assayed in the laboratory using a reporter assay (Czaja et al., 2020; Garfinkel et al., 2003). This observation also supports the hypothesis that genotype-specific characteristics can be major determinants of variation in Ty CN.

Discussion

The literature on TE reactivation in hybrids demonstrates its relevance to the field of speciation (Serrato-Capuchina and Matute, 2018). Reactivation can be a source of postzygotic reproductive isolation by compromising genome stability (Dion-Côté and Barbash, 2017), which translates into a spectrum of deleterious hybrid phenotypes spanning sterility, developmental abnormalities and inviability. Many studies investigated the TE reactivation hypothesis at the interspecific level and yielded equivocal evidence. For instance, mixed evidence for TE reactivation was found within the Drosophila genus (Coyne, 1989; Hey, 1988; Labrador et al., 1999; Vela et al., 2014), where reactivation seems to depend on the species pair involved. On the other hand, three well-established hybrid species that emerged from parallel hybridization between the same species pair in the genus Helianthus all exhibit important reactivation of LTR retrotransposons, both at the genomic (Ungerer et al., 2009; Ungerer et al., 2006) and transcriptomic (Renaut et al., 2014) levels. However, a study of hybrid zones between the two parental species revealed that no reactivation is apparent in recent hybrid populations, casting doubt on the idea that hybridization alone triggered reactivation and suggesting the involvement of other factors like abiotic stress (Kawakami et al., 2011). Overall, this data nuances the hypothesis that hybridization could act as a systematic trigger of TE proliferation and reproductive isolation.

In this study, we used population genomic data on North American lineages of the yeast S. paradoxus to test whether TEs were reactivated in natural hybrids. It was previously shown that these lineages experienced two hybridization events that resulted in the establishment of admixed lineages that have partial postzygotic reproductive isolation with their respective parental lineages (Eberlein et al., 2019; Leducq et al., 2016). We found no evidence of TE reactivation in any of these natural hybrid lineages. Our data showed that while some hybrid lineages may have relatively high CNs and exhibit evolutionary dynamics consistent with recent expansions, these signatures were never more pronounced than that of their parental lineages (SpC in the case of SpC* and SpC* in the case of SpD₁ and SpD₂). Moreover, our analyses suggest that environmental and demographic factors are unlikely to be major factors affecting TE CNs in natural lineages, which therefore likely reflect the underlying transposition rates. Our observations would seem consistent with a role of TE reactivation in speciation, as natural hybrids with high levels of reactivation would likely be unobserved. However, they contrast with long-established hybrid species in Helianthus, showing that reactivation and long-term maintenance can be compatible.

We note that the analysis of contemporary genomes is inherently limited in the ability to uncover transient reactivation that would immediately follow hybridization and rapidly faint, perhaps as a result of a re-establishment of TE repression. However, this scenario may be unlikely in yeast, given the absence of many epigenetic and post-transcriptional repression mechanisms (see discussion below) that can yield such a rapid re-establishment in other species. For instance, the re-establishment of piRNA-mediated P element repression in Drosophila P-M dysgenic hybrids occurs in single female individuals because paternally inherited piRNA clusters become expressed as they age (Khurana et al., 2011). If a hypothetical re-establishment of Ty repression was rather driven by selection for genomic variants, this should result in a gradual improvement of repression with time. Our data provides no support for this hypothesis, as younger hybrid lineages SpD₁ and SpD₂ do not show more pronounced signatures of recent Ty expansion compared to the older SpC* lineage. Additionally, Ty CNs of artificial hybrids at the initial time point of our MA experiment (i.e. early after hybridization) generally show good agreement with the expectation from their parental haploid strains, which is incompatible with an early reactivation of Ty elements.

Many factors are susceptible to determine whether a hybrid genotype exhibits reactivation or not. According to the Bateson-Dobzhansky-Muller (BDM) model, genetic incompatibilities should accumulate as populations diverge (Orr, 1995). Additionally, the evolutionary conflict between host genomes and TEs results in an arms race that can accelerate the evolution of the genes involved (Blumenstiel et al., 2016; Van Valen, 1973). Thus, it can be expected that incompatibilities affecting transposition rates are more likely to be observed in hybrids between highly diverged genotypes compared to closely related ones. Here, we report data from MA experiments on artificial hybrids that probe the relationship between evolutionary divergence and transposition rates. We show that contrary to the expectation from the BDM model, hybrids between more divergent genotypes exhibit no deregulation of TEs. This conclusion is consistent with recent results from Smukowski Heil and colleagues, who showed that no Ty1 reactivation occurs in experimental interspecific hybrids between S. cerevisiae and S. uvarum (Smukowski Heil et al., 2020). Instead of reactivation, many hybrid genotypes exhibited marked decreases in TE CNs after evolution. This trend is especially important for Ty3p, as eight out of ten genotypes exhibit a negative change in median CN. Ty3p is also the least abundant Ty family in natural lineages, as it is found in rarely more than ~5 copies and extinct in all hybrid strains sampled. Such low abundance families are likely more prone to extinction, as stochastic fluctuation in CN would more frequently lead to the loss of all copies. Stochastic loss of Ty copies may come from intra-element LTR recombination or loss of heterozygosity at a locus where a Ty element is present in one parent only, although the mechanistic causes of CN decrease remain to be investigated. This result also seems consistent with the reported Ty CN decrease in mitotically propagated experimental populations of S. cerevisiae (Bast et al., 2019), suggesting a potential intrinsic bias towards TE loss. However, it should be noted that the latter study used haploid populations in a regime with efficient natural selection (McDonald et al., 2016), thus limiting the relevance of comparisons with our results.

Transposition is a complex cellular process, with multiple steps regulated by both TEs and host genomes. Genetic screens in S. cerevisiae revealed that genes involved in multiple cellular functions (chromatin organization, transcription, translation, RNA and protein metabolism) affect Ty1 transposition rate, either positively or negatively (reviewed in Curcio et al., 2015). Thus, it is possible that divergent hybrids suffer from misregulation at different steps of the transposition process that have antagonistic effects, such that they cancel out at the level of transposition rates. It is also possible that such antagonistic interactions exist between transposition and excision rates, yielding no net change in CN. Answering these questions will require a thorough investigation of novel Ty insertion loci in the MA lines, which is beyond the scope of this study. Transposition regulation being a polygenic trait with a large mutational target is also consistent with the strong genotype specificity we observed in our artificial hybrids. Some of the contrasted dynamics between crosses that share high genetic similarity might originate from cryptic genetic variation in genes involved in Ty regulation, whose effect on transposition rate would be uncovered in hybrids.

TEs are structurally very diverse and populate the genomes of a vast range of species, and the control mechanisms in response to their mobilization may exhibit a correspondingly large diversity. In many eukaryotes, TEs are repressed epigenetically through mechanisms that include DNA methylation, chromatin modification, and mRNA degradation, processes in which small RNAs play central roles (Lisch, 2009; Slotkin and Martienssen, 2007). In many fungi, TE repression acts via multiple mechanisms involving RNA interference (RNAi) or cytosine methylation, including repeat-induced point mutation (RIP) (Gladyshev, 2017). Notably, introgressed species of the fungal genus Coccidioides harbor genomic signatures of efficient RIP (Neafsey et al., 2010), suggesting that this mechanism may remain active in the face of hybridization. Unlike other eukaryotes, S. cerevisiae and its closely related species lost the genes of the RNAi pathway (Drinnenberg et al., 2009) and have no cytosine methylation (Proffitt et al., 1984). Instead, S. cerevisiae Ty1 has a potent self-repression mechanism, with a strength that depends on the actual Ty1 CN in the host genome (thus termed CN control, CNC) (Garfinkel et al., 2003). CNC is a post-transcriptional mechanism that relies on p22, an N-truncated form of the Gag (p49) Ty1-encoded capsid protein which is translated from an alternative transcript of Ty1 (Saha et al., 2015). p22 inhibits transposition by interfering with the assembly of the cytoplasmic virus-like particles that are essential to the replication cycle of Ty1 (Saha et al., 2015). This unique repression mechanism may explain some of the patterns we observed in the artificial hybrids. In particular, our data suggests a negative relationship between the initial Ty1 CN in hybrid genotypes and the change in Ty1 CN between the onset and the end of the MA experiment. This pattern seems consistent with an efficient action of CNC, although our results would then suggest that CNC efficiency is largely independent of the degree of evolutionary divergence between the subgenomes of a hybrid. This contrasts with the recent finding that interspecific sequence divergence in p22 has a dramatic impact on the action of CNC (Czaja et al., 2020), although such impact is likely mitigated by the generally much lower divergence levels in our crosses. Thus, further experimental work is needed to disentangle the mechanistic contributions of Ty1 CN and sequence divergence in the regulation of transposition rates in Saccharomyces hybrids.

In conclusion, our work illustrates how the use of natural model systems that are amenable to experimental investigations can bring novel insights on long-standing evolutionary hypotheses. Our results show that natural hybrid lineages of the wild yeast S. paradoxus exhibit no TE reactivation and suggest that dynamics of TE content evolution in hybrids depend on deterministic factors that evolutionary divergence between parents can hardly predict.

Materials and methods

Key resources table

Reagent type (species) or resource	Designation	Source or reference	Identifiers	Additional information
Sequence-based reagent	Forward primer to construct the deletion cassette for the ADE2 gene in SpC	Charron et al., 2019	CLOP97-F5	acaattaaggaatcaagaaaccgt gataaaaaattcaagt CAGCTGAAGCTTCGTACGC
Sequence-based reagent	Reverse primer to construct the deletion cassette for the ADE2 gene in SpC	Charron et al., 2019	CLOP97-F6	gtaattgttcgctggccaagtata ttaatacatttatata GCATAGGCCACTAGTGGATC
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	Charron et al., 2014	LL2011_001 MATa hoΔ::kanMX4
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	This study	LL2011_001 MATa hoΔ::kanMX4 ade2Δ::hphNT1	Haploid yeast strain with ho and ade2 deletions. See Materials and methods section Mutation accumulation
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	Leducq et al., 2016	LL2011_012 MATa hoΔ::kanMX4
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	This study	LL2011_012 MATa hoΔ::kanMX4 ade2Δ::hphNT1	Haploid yeast strain with ho and ade2 deletions. See Materials and methods section Mutation accumulation
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	Charron et al., 2014	MSH-587–1 MATa hoΔ::natMX4
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	This study	MSH-587–1 MATa hoΔ::natMX4 ade2Δ::hphNT1	Haploid yeast strain with ho and ade2 deletions. See Materials and methods section Mutation accumulation
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	Charron et al., 2019	LL2011_004 MATα hoΔ::kanMX4 ade2Δ::hphNT1
Strain, strain background (Saccharomyces paradoxus)	Haploid parent for MA lines	Charron et al., 2019	LL2011_009 MATα hoΔ::natMX4 ade2Δ::hphNT1

Share this article

Cite this article

Population genomic data and laboratory evolution experiments allow to uncover the effects of various factors on transposable element (TE) accumulation in hybrids.

Ty elements showed no reactivation in hybrid lineages and contrasted abundances among pure lineages.

Figure 2—source data 1

Evolutionary dynamics of Ty families in S. paradoxus.

Figure 3—source data 1

Figure 3—source data 2

Figure 3—source data 3

Associations between population structure, climatic variables and Ty CNs for all lineages (left), SpB only (middle), or SpC only (right).

Figure 4—source data 1

Figure 4—source data 2

Genomic patterns of Ty insertions indicate that natural selection efficiency cannot explain CN variation in natural lineages.

Figure 5—source data 1

Figure 5—source data 2

Ty copy number (CN) variation in mutation accumulation (MA) lines reveal no effect of parental divergence and major genotype-specific effects.

Figure 6—source data 1

Figure 6—source data 2

Author details

Mathieu Hénault

Contribution

For correspondence

Competing interests

Souhir Marsit

Contribution

Competing interests

Guillaume Charron

Present address

Contribution

Competing interests

Christian R Landry

Contribution

Competing interests

Citations by DOI

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Further reading