Asexual reproduction reduces transposable element load in experimental yeast populations
Abstract
Theory predicts that sexual reproduction can either facilitate or restrain transposable element (TE) accumulation by providing TEs with a means of spreading to all individuals in a population, versus facilitating TE load reduction via purifying selection. By quantifying genomic TE loads over time in experimental sexual and asexual Saccharomyces cerevisiae populations, we provide direct evidence that TE loads decrease rapidly under asexual reproduction. We show, using simulations, that this reduction may occur via evolution of TE activity, most likely via increased excision rates. Thus, sex is a major driver of genomic TE loads and at the root of the success of TEs.
https://doi.org/10.7554/eLife.48548.001
eLife digest
The genetic information of most living organisms contains parasitic invaders known as transposable elements. These genetic sequences multiply by copying and pasting themselves through the genome, but this process can disrupt the activity of important genes and put the organism at risk.
How transposable elements proliferate in a population depends on the way organisms reproduce. If they simply clone themselves asexually, the selfish elements cannot spread between the different clones. If the organisms mate together their respective transposable elements get mixed, which helps the sequences to spread more easily and to potentially become more virulent. However, sexual reproduction also comes with mechanisms that keep transposable elements in check.
Bast, Jaron et al. took advantage of the fact that yeasts can reproduce with or without mating to explore whether sexual or asexual organisms are better at controlling the spread of transposable elements. The number of copies of transposable elements in the genomes of yeast grown sexually or asexually was assessed. The results showed that sexual populations kept constant numbers of selfish elements, while asexual organisms lost these genomic parasites over time. Simulations then revealed that this difference emerged because a defense gene that helps to delete transposable elements was spreading more quickly in the asexual group.
The work by Bast, Jaron et al. therefore suggests that sex is responsible for the evolutionary success of transposable elements, while asexual populations can discard these sequences over time. Sex therefore helps genetic parasites, somewhat similar to sexually transmitted diseases, to spread between individuals and remain virulent.
https://doi.org/10.7554/eLife.48548.002
Introduction
Self-replicating transposable elements (TEs) can occupy large fractions of genomes in organisms throughout the tree of life (reviewed in Hua-Van et al., 2011). Their overwhelming success is driven by their ability to proliferate independently of the host cell cycle via different self-copying mechanisms involving ‘cut-and-paste’ or ‘copy-and-paste’ systems. These mechanisms allow TEs to invade genomes in a similar way to parasites, despite generally not providing any advantage to the individual carrying them (Doolittle and Sapienza, 1980; Orgel and Crick, 1980). To the contrary, TEs generate deleterious effects in their hosts by promoting ectopic recombination and because most new TE insertions in coding or regulatory sequences disrupt gene functions (Finnegan, 1992; Montgomery et al., 1991).
Theory predicts that sexual reproduction can either facilitate or restrain the genomic accumulation of TEs, and it is currently unclear whether the expected net effect of sex on TE loads is positive or negative. Sexual reproduction can facilitate the accumulation of TEs because it allows TEs to colonise new genomes and spread throughout populations (Hickey, 1982; Zeyl et al., 1996). Because the colonisation of new genomes is more likely for active TEs, sexual reproduction should favour the evolution of highly active TEs (Charlesworth and Langley, 1986; Hickey, 1982), even though increased activity generates higher TE loads in the host genome. At the same time, sexual reproduction facilitates the evolution of host defences and increases the efficacy of purifying selection against deleterious TE copies by reducing selective interference among loci (Ågren and Wright, 2011; Arkhipova and Meselson, 2005; Crespi and Schwander, 2012; Wright and Finnegan, 2001). In the absence of sex, reduced purifying selection can thus result in the accumulation of TEs, unless TE copies get eliminated via excision at sufficiently high rates (Burt and Trivers, 2006; Dolgin and Charlesworth, 2006).
Genomic TE loads have been empirically estimated for natural populations of asexual and related sexual organisms, but no consistent difference emerges (Ågren et al., 2015; Bast et al., 2016; Jiang et al., 2017; Szitenberg et al., 2016), probably because many confounding factors not related to reproductive mode such as hybridisation and polyploidisation can affect TE loads and mask the effect of sex (Arkhipova and Rodriguez, 2013; Hua-Van et al., 2011).
Here, to quantify whether the net effect of sexual reproduction on genomic TE loads is positive or negative, we study the evolution of genomic TE loads in experimental yeast (Saccharomyces cerevisiae) populations generated in a previous study (McDonald et al., 2016). McDonald et al. maintained four sexual and four asexual strains originating from the same haploid ancestral strain (W303) under constant conditions over 1000 generations. For sexual strains, a mating event (meiosis) was induced every 90 generations. Sequencing of each strain was conducted at generation 0 and every 90 generations prior to mating (for details see Materials and methods, and McDonald et al., 2016). In the present study, we use the published Illumina data to quantify TE loads in each strain for each sequenced generation.
TEs in S. cerevisiae are well characterised (Carr et al., 2012; Castanera et al., 2016; Voytas and Boeke, 1992). S. cerevisiae TEs consist solely of ‘copy-and-paste’ elements that are flanked by long terminal repeats (LTRs) and are grouped into the families Ty1-Ty5 (Voytas and Boeke, 1992). The 12.2 Mb genome of the studied yeast strain comprises approximately 50 full-length, active Ty element copies, and 430 inactive ones (Carr et al., 2012). Inactive copies include truncated elements as well as remnants from TE excisions (i.e., solo-LTR formation; Carr et al., 2012). Excisions occur by intra-chromosomal recombination between the two flanking LTRs of a TE, and result in the removal of protein-coding genes that allow for transposition.
Using different computational approaches to quantify genomic TE loads in experimental yeast strains, we show that sex is required for the success of TEs, as TE loads decrease over time under asexual reproduction. For the first approach, we quantified total TE loads without distinguishing between active and inactive TEs. This was done by computing the fraction of reads that mapped to a curated S. cerevisiae TE library (see Materials and methods) for each yeast strain and sequenced generation. This analysis revealed that the total TE load in sexual strains remained constant over 1000 generations, but decreased in asexual strains over time (resulting in a total reduction of 23.5% after 1000 generations; generation effect p<0.001, reproductive mode effect p=0.081, and interaction between generation and mode p<0.001; permutation ANOVA, Figure 1—figure supplement 1). For the second approach, we focused on full-length TE copy insertions, because only those are active and can lead to increased genomic TE loads over time. Detecting specific TE insertions by aligning short-read data to a reference genome is difficult and associated with a detection bias towards TEs present in the reference genome. Moreover, because sequencing was done with population pools and not individual clones within populations, it is not possible to analyse turnover or activity of TEs within specific genomic backgrounds. Instead, we analysed the presence versus absence of specific TE insertions in each population over time. With a pipeline that combines different complementary approaches (Nelson et al., 2017, see Materials and methods), the available sequencing data allowed us to detect 24 out of the 50 full-length insertions that are present in the reference genome of the ancestral strain at the start of the experiment (generation 0). 
As with the first approach, we found that the number of (detectable) full-length TE copies remained constant in sexual yeast strains, but decreased in asexual strains over time (generation effect p=0.006, reproductive mode effect p=0.033, and interaction between generation and mode p<0.001; permutation ANOVA). In asexual strains, the estimated average number of full-length TEs decreased from approximately 50 to 41 over 1000 generations (Figure 1).

Sex maintains constant TE loads through time, while its absence leads to TE copy number reductions, for both (A) empirical data and (B) simulations including an allele modifying TE activity rates.
(A) Number of full-length TE copies inserted in genomes of four replicates of otherwise identical occasionally sexual (red) and wholly asexual (blue) yeast strains over 1000 generations of experimental evolution. Numbers are expressed as residuals, since the TE detection probability depends on sequencing coverage (Figure 1—figure supplement 2). (B) Individual-based simulations for studying the TE load dynamics expected under sexual and asexual reproduction with ten replicates (red and blue dotted lines). The simulations are parameterised with yeast-specific values and include a modifier allele. For both (A) empirical and (B) simulation data, asexuals lost about nine active, full-length TEs by generation 1000. Lines represent linear regression for sexuals (red) and asexuals (blue) and the grey areas represent 95% CI.
This decrease could be generated by either increased TE excision rates in asexual as compared to sexual yeast, reduced transposition rates, or a combination of both mechanisms. To evaluate the relative importance of the two mechanisms, we estimated the number of losses of TEs present in the ancestral yeast strain, as well as the number of novel insertions, at each assayed generation (Figure 2). These analyses revealed that ‘ancestral’ TE insertions are lost at a higher rate in asexual than sexual strains (generation effect p=0.002, reproductive mode effect p=0.027, and interaction between generation and mode p<0.001; permutation ANOVA), while we detected similar numbers of novel TE insertions (indicating similar transposition rates) under both reproductive modes (generation effect p=0.338, reproductive mode effect p=0.271, and interaction between generation and mode p=0.599; permutation ANOVA). Taken together, our empirical observations indicate that even very rare events of sex (here just 10 out of 990 reproduction events) are sufficient to maintain genomic TE loads, while asexuality results in the reduction of TE loads.

Decrease of insertions in asexuals over time is largely due to loss of ‘ancestral’ reference insertions (A) rather than novel insertions (B).
Count of all TE insertions, irrespective of whether they are full-length TEs, solo LTRs, truncated elements or other types, in genomes of four replicates of sexual (red) and asexual (blue) yeast strains over 1000 generations of experimental evolution. Numbers are expressed as residuals, since TE detection probability depends on sequencing coverage. Lines represent linear regression for sexuals (red) and asexuals (blue) and the grey areas represent 95% CI.
The parallel reduction of TE loads in different asexual strains suggests that the evolution of reduced TE activity (the ratio of transposition to excision) in asexual strains influences genomic TE loads more strongly than purifying selection, which should act to reduce TE loads most effectively in sexual strains. To evaluate whether these findings are plausible, we tested whether the net loss of TEs under asexuality is predicted by a simple model of TE dynamics. As explained above, different theoretical approaches have shown that both purifying selection and activity rate evolution can affect TE loads under sexual or asexual reproduction (Charlesworth and Langley, 1986; Dolgin and Charlesworth, 2006; Hickey, 1982). However, no theoretical study has considered TE load evolution under the joint effects of the different processes. To fill this gap, we extended the individual-level simulation program of Dolgin and Charlesworth (2006). This program allows the study of the evolution of TE copy numbers in an asexual lineage as a function of TE activity (the joint effects of transposition and excision rates), as well as of the strength of selection against TE insertions, which depends on the fitness cost per TE insertion. To compare TE loads in sexual and asexual lineages, we first extended the program to include events of sexual reproduction and parameterised the simulations with empirically determined values from yeast (Blanc and Adams, 2004; Carr et al., 2012; Garfinkel et al., 2005). We ran individual-based simulations with a range of transposition rates, excision rates and selection coefficients with and without epistasis between TE copies as pertinent for yeast (see Supplementary file 2A).
For all simulations, TE loads in populations undergoing sex every 90 generations decreased faster than in asexual populations, contrary to our empirical observations. This occurs because sexual events generate variation among individuals in TE loads (and thus variation in fitness), which facilitates selection against deleterious TEs (see also Dolgin and Charlesworth, 2006). Different transposition rates under meiosis (sex) or mitosis (asex) did not affect this finding. Indeed, increased TE activity during meiosis only transiently increases TE loads in sexual strains. Because such activity also generates increased variation in TE loads (and therefore in fitness) among strains, the additional TE copies generated during meiosis are rapidly removed by purifying selection (Figure 1—figure supplement 3). In short, none of the simulations generated the empirically observed pattern of lower TE loads in asexual than sexual strains. In a second step, we therefore allowed TE activity rates to evolve over time, by introducing a modifier allele that increases excision rates. The allele has no direct fitness effect, so it can only be fixed in a population via genetic hitchhiking. In simulations that included the modifier allele, the modifier spreads rapidly to fixation in asexual strains, because it is associated with genomes that have fewer TE copies, and therefore have a higher relative fitness. As a consequence, TE activity rates decrease in asexual populations (Figure 1—figure supplement 4). By contrast, the modifier cannot spread as rapidly in sexual populations because recombination constantly breaks up the association between the modifier and less TE loaded backgrounds. By allowing for the evolution of TE activity rates in our simulations, we were able to identify parameter values representative for yeast that result in simulations with a very close fit to our empirical results (Figure 1B, Supplementary file 2B). 
These analyses thus corroborate our empirical findings that a likely mechanism driving genomic TE load reduction in asexual yeast strains is the rapid evolution of increased TE excision rates. A similar effect would be expected if our modifier acted on transposition rather than excision rates, since the net TE activity depends on the relative rates of transposition vs excision. However, our empirical results do not suggest major differences in transposition rates between sexual and asexual yeast strains. In combination with our findings that, in the absence of TE activity evolution, sexual strains always lose TEs faster than asexual ones, the empirical results are best explained by an increase in TE excision rates under asexuality (Figures 1 and 2).
Our study shows that sexual reproduction permits the maintenance of TEs in S. cerevisiae, while in its absence, TE loads decrease, likely via the evolution of TE activity rates. The findings are consistent with empirical findings of low TE activity in old asexuals (Bast et al., 2016) and the idea that TEs should evolve to be benign in asexual species, because the evolutionary interests of TEs and their host genome are aligned (Charlesworth and Langley, 1986). While the exact mechanisms causing TE activity change in the asexual yeast populations cannot be assessed in the empirical data, our simulations suggest that there is some form of TE defense mechanism (a ‘modifier locus’) that either segregates in the ancestral yeast strain used in the experiments or repeatedly appeared de novo during experimental evolution. Independently of the exact mechanism, we confirm that TE loads do not increase, but decrease, in asexual populations. This contrasts with the hypothesis that most asexual species are evolutionarily short lived because they are driven to extinction via negative consequences of accumulating TE copies (Arkhipova and Meselson, 2005). Instead, sex is at the root of the evolutionary success of parasitic TEs.
Materials and methods
Yeast experimental evolution
We used data generated in a previous study based on experimental evolution of the yeast S. cerevisiae (for in-depth details see McDonald et al., 2016). In short, 12 different strains were initiated from the same pool of ancestral strains (derived from haploid W303 strains) and kept under constant conditions. Sexual reproduction in yeast depends on the presence of two separate mating types. Only individuals with different mating types can fuse and go through meiosis. Asexual reproduction occurs through budding. For the experiment, six haploid strains of mating type a (MATa) and six haploid strains of mating type α (MATα) were grown over 990 generations. Of these, four strains were grown exclusively asexually (two of MATa, two of MATα), while the eight others (four of MATa, four of MATα) were mixed for mating events every 90 generations, resulting in four sexual strains. Paired-end Illumina reads were generated for each of the 12 different strains every 90 generations during 990 generations (for a total of 11 sequencing events per strain). Read numbers per sample ranged from 12,775 to 10,270,312, averaging 2,964,869 reads per sample, with a total of 818,303,966 reads. Details of the read data can be found at BioProject PRJNA308843 and in the original study (McDonald et al., 2016).
Data processing
The genome of the haploid W303 S. cerevisiae strain was retrieved from Lang et al. (2013). All Illumina paired-end raw reads of the 12 replicate strains generated in McDonald et al. (2016) were downloaded from the SRA (BioProject identifier PRJNA308843). Raw reads were quality filtered by first removing adapter sequences (with the script used in the original study; McDonald et al., 2016; provided by Daniel P Rice, Harvard University), followed by removing the first 10 bases and quality trimming using trimmomatic v0.33 (Bolger et al., 2014) with parameters set to LEADING:3 TRAILING:3 HEADCROP:10 SLIDINGWINDOW:4:15 MINLEN:36. Additionally, non-overlapping paired-reads were constructed in silico from the subset of the original paired-reads that were overlapping, as a prerequisite to run the insertion detection pipeline. For this, overlapping reads (on average overlapping by 16 bp) were merged using PEAR v0.9.6 with standard parameters (Zhang et al., 2014). Merged reads were split in half and 20 bp deleted from each read at the overlapping ends using the fastx_toolkit v 0.0.13.2 (Hannon Laboratory, 2010). This resulted in mean read lengths of 72 bp. These ‘artificial’ non-overlapping read pairs were afterwards merged with the read set fraction that was non-overlapping.
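The construction of 'artificial' non-overlapping pairs can be illustrated with a minimal sketch. It assumes that 'split in half' means cutting the merged read at its midpoint and that the 20 bp are removed from the inner (formerly overlapping) end of each half. The function names and the decision to reverse-complement the second mate are illustrative assumptions; the authors used PEAR and the fastx_toolkit, not this code.

```python
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def split_merged_read(merged: str, trim: int = 20) -> Tuple[str, str]:
    """Split a merged read into two non-overlapping mates.

    The read is cut at its midpoint and `trim` bases are removed from the
    inner end of each half, leaving a gap of 2 * trim bases between mates.
    The second mate is reverse-complemented (an assumption made here so the
    pair resembles a standard forward/reverse Illumina pair).
    """
    mid = len(merged) // 2
    left, right = merged[:mid], merged[mid:]
    if len(left) <= trim or len(right) <= trim:
        raise ValueError("merged read is too short to trim")
    mate1 = left[:-trim]                        # drop inner end of left half
    mate2 = reverse_complement(right[trim:])    # drop inner end of right half
    return mate1, mate2


# Hypothetical merged read of 184 bp: each mate keeps 184 / 2 - 20 = 72 bp.
example = "A" * 92 + "C" * 92
mate1, mate2 = split_merged_read(example)
```

Under these assumptions a 184 bp merged read yields two 72 bp mates, the same length as the mean read length reported above.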
Overall transposable element load
An S. cerevisiae-specific, curated and updated TE library that contains consensus sequences of all TE families found in this species is available from Carr et al. (2012). With this library, we identified TE content and specific copy insertions in the W303 genome using RepeatMasker v4.02 (Smit et al., 2015) with parameters set to -nolow -gccalc -s -cutoff 200 -no_is -nolow -norna -gff -u -engine rmblast. For overall TE load estimates, the fraction of reads mapped to TEs out of total mappable reads was calculated. For this, the TE library was appended to the masked W303 genome and all reads for all strains and generations were mapped using BWA v0.7.13 with standard parameters (Li, 2013). For all strains, mean per-base coverage was checked with bedtools genomecov v2.26 (Quinlan and Hall, 2010), upon which the asexual strain sample 3D-90 was excluded from all further analyses, as coverage was lower than one-fold for this sample. Following this analysis, stat-reads from the PopoolationTE2 v1.10.04 program (Kofler et al., 2016) was utilised to extract the number of total mapped reads and reads mapped to TEs. For statistics, a permutation ANOVA with the formula lm(coverage ~ generation*mode) was utilised; for details see github repository (Bast and Jaron, 2019, copy archived at https://github.com/elifesciences-publications/reproductive_mode_TE_dynamics).
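The overall TE load measure described here is a simple ratio, which the following minimal sketch makes explicit. The function names and the read counts in the example are hypothetical and are not taken from the study's data or scripts.

```python
def te_load_fraction(te_mapped_reads: int, total_mapped_reads: int) -> float:
    """Fraction of mapped reads that map to the TE library."""
    if total_mapped_reads <= 0:
        raise ValueError("total_mapped_reads must be positive")
    if not 0 <= te_mapped_reads <= total_mapped_reads:
        raise ValueError("te_mapped_reads must lie between 0 and the total")
    return te_mapped_reads / total_mapped_reads


def relative_change(initial: float, final: float) -> float:
    """Relative change between two load estimates (negative means a loss)."""
    if initial == 0:
        raise ValueError("initial load must be non-zero")
    return (final - initial) / initial


# Hypothetical counts for one strain at two sequenced generations.
load_start = te_load_fraction(te_mapped_reads=40_000, total_mapped_reads=1_000_000)
load_end = te_load_fraction(te_mapped_reads=30_000, total_mapped_reads=1_000_000)
change = relative_change(load_start, load_end)  # negative: load decreased
```

Because the measure is a fraction of mapped reads, it is normalised for sequencing depth, but it does not distinguish full-length from truncated elements or solo LTRs.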
Specific transposable element insertions
To detect specific reference (present in the reference genome) and non-reference TE insertions in all samples, the McClintock pipeline was utilised (Nelson et al., 2017). This pipeline combines six different, benchmarked programs in a standardised fashion. McClintock was run with the non-overlapping read set, the curated TE library, and the W303 assembly using default parameters. The nonredundant insertions output file per sub-program was collected. Next, we utilised a custom Python script to collect all information on insertions detected by all different programs and counted insertions with evidence from different programs only once.
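Counting an insertion only once when several programs report it can be sketched as follows. The matching rule used here (same chromosome, same TE family, positions within a fixed window) and the 100 bp window are assumptions for illustration; the authors' custom script may apply different criteria, and the program labels are placeholders.

```python
from typing import Dict, Iterable, List, NamedTuple


class Call(NamedTuple):
    chrom: str     # chromosome or contig name
    pos: int       # reported insertion coordinate
    family: str    # TE family, e.g. "Ty1"
    program: str   # name of the detecting program (placeholder labels here)


def merge_calls(calls: Iterable[Call], window: int = 100) -> List[Dict]:
    """Collapse calls from different programs into unique insertions.

    Calls are sorted by chromosome, family and position; a call is merged
    into the previous insertion if it shares chromosome and family and lies
    within `window` bp of the last position merged into that insertion.
    """
    merged: List[Dict] = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.family, c.pos)):
        last = merged[-1] if merged else None
        if (last is not None
                and last["chrom"] == call.chrom
                and last["family"] == call.family
                and call.pos - last["end"] <= window):
            last["end"] = call.pos
            last["programs"].add(call.program)
        else:
            merged.append({"chrom": call.chrom, "family": call.family,
                           "start": call.pos, "end": call.pos,
                           "programs": {call.program}})
    return merged


# Hypothetical calls: the first two describe the same insertion.
calls = [
    Call("chrI", 1000, "Ty1", "prog_a"),
    Call("chrI", 1040, "Ty1", "prog_b"),
    Call("chrI", 5000, "Ty1", "prog_a"),
    Call("chrII", 1000, "Ty2", "prog_c"),
]
unique_insertions = merge_calls(calls)
```

Keeping the set of supporting programs for each merged insertion makes it possible to filter later by the number of independent lines of evidence.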
To identify full-length TEs and solo LTR insertions from the McClintock custom filtered output, we tagged insertions by length according to the typical Ty element properties found in S. cerevisiae (i.e. a full TE is a combination of internal sequence and two LTRs within a 500 bp range; solo LTRs are between 220 and 420 bp; see Supplementary file 1). Because TE insertion detection was influenced by the coverage, coverage was taken into account when calculating the number of insertions, by adding it as a random factor (coverage effect p<0.001, generation effect p=0.006, reproductive mode effect p=0.033, and interaction between generation and mode p<0.001; permutation ANOVA with the formula lm(counts ~ coverage+generation*mode); for details see github repository, Bast and Jaron, 2019). We then calculated the number of lost TEs in asexual strains from the regression slope in asexuals after correcting for coverage (i.e. computing residuals) over 1000 generations, with 50 full-length TEs in the ancestor. To additionally check for a bias due to coverage differences between sexual and asexual strains, we randomly subsampled read data for each sample corresponding to the mean coverage of the asexual strains for each generation (Figure 1—figure supplement 2).
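The coverage correction and slope-based loss estimate can be illustrated with a minimal sketch. It uses a two-step simplification (regress counts on coverage, then regress the residuals on generation) rather than the full model lm(counts ~ coverage+generation*mode) reported above, and all function names and numbers are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def ols_fit(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Return (intercept, slope) of the least-squares line y ~ x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variance")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def coverage_residuals(counts: Sequence[float],
                       coverage: Sequence[float]) -> List[float]:
    """Residual TE counts after removing the linear effect of coverage."""
    intercept, slope = ols_fit(coverage, counts)
    return [c - (intercept + slope * cov) for c, cov in zip(counts, coverage)]


def estimated_loss(counts: Sequence[float], coverage: Sequence[float],
                   generations: Sequence[float],
                   n_generations: float = 1000) -> float:
    """TE copies lost over `n_generations`, from the residual slope."""
    residuals = coverage_residuals(counts, coverage)
    _, slope = ols_fit(generations, residuals)
    return -slope * n_generations


# Hypothetical data: counts depend on coverage and decline with generation.
generations = [0, 90, 180, 270]
coverage = [10, 20, 20, 10]
counts = [5 + 2 * cov - 0.01 * gen for cov, gen in zip(coverage, generations)]
lost = estimated_loss(counts, coverage, generations)
```

The two-step approach recovers the generation slope only when coverage and generation are uncorrelated, as in this toy data; fitting both terms jointly avoids that restriction.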
Modelling
To model TE dynamics in yeast we adjusted an individual-based, forward-in-time simulator by Dolgin and Charlesworth (2006). We extended the model to include sexual cycles via fusion of two haploid individuals and recombination, with on average one crossover on each of the 16 modelled chromosomes (yeast has 16 chromosomes; Goffeau et al., 1996; McDonald et al., 2016). Each chromosome carries 200 loci that are potential targets for a TE insertion. A simulation is initiated with a single individual with 50 TEs randomly placed in the 3200 loci of the genome. The founder individual then clonally populates the whole simulated deme of 100,000 explicitly simulated individuals. With currently available computational resources, there was no need to scale deterministic parameters of the model as was done in the original study by Dolgin and Charlesworth (2006). To account for mutations during this phase, we ran 20 burn-in generations of transposition and excision cycles on every individual separately without applying selection. One generation in the simulation consists of a round of selection and reproduction with transposition occurring during reproduction, followed by excision. The relative fitness of an individual carrying $n$ TEs was modelled as $w_n = \exp(-an - \tfrac{1}{2}bn^2)$, where $a$ and $b$ are parameters representing the strength of selection and the strength of epistatic interactions between TEs respectively (Dolgin and Charlesworth, 2006). The simulation was then continued for 990 generations. We performed 10 replicates of each simulation. Using the average TE load in the population measured every 10 generations, we fitted a linear model to estimate average TE loss across the ten replicates of each simulation. Parameters were derived from yeast experimental measurements and simulations were run with perturbation in the surrounding parametric space (see Supplementary file 2A).
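The structure of one asexual generation can be sketched in a few lines. This is a heavily simplified illustration, not the authors' simulator: it tracks only the TE copy number per individual (no loci, chromosomes, sex or modifier allele), and it assumes the exponential-quadratic fitness function of Dolgin and Charlesworth (2006), w_n = exp(-a*n - 0.5*b*n^2). The parameter values in the example are hypothetical.

```python
import math
import random
from typing import List


def relative_fitness(n: int, a: float, b: float) -> float:
    """Fitness of an individual with n TE copies (assumed functional form)."""
    return math.exp(-a * n - 0.5 * b * n * n)


def next_generation(te_counts: List[int], a: float, b: float,
                    u: float, v: float, rng: random.Random) -> List[int]:
    """One asexual generation: selection, then transposition, then excision.

    u is the per-copy transposition probability and v the per-copy excision
    probability per generation. Population size is held constant.
    """
    weights = [relative_fitness(n, a, b) for n in te_counts]
    # Selection and reproduction: parents sampled in proportion to fitness.
    parents = rng.choices(te_counts, weights=weights, k=len(te_counts))
    offspring = []
    for n in parents:
        gained = sum(rng.random() < u for _ in range(n))        # transposition
        n_after = n + gained
        lost = sum(rng.random() < v for _ in range(n_after))    # excision
        offspring.append(n_after - lost)
    return offspring


# Hypothetical run: 1000 clonal individuals starting with 50 TE copies each.
rng = random.Random(42)
population = [50] * 1000
for _ in range(100):
    population = next_generation(population, a=0.001, b=0.0,
                                 u=0.0001, v=0.0005, rng=rng)
mean_load = sum(population) / len(population)
```

Tracking only copy numbers is sufficient for an asexual lineage under this fitness function, but modelling sexual cycles or a modifier allele would require explicit insertion sites and recombination, as in the simulator described above.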
We further explored the effects of different transposition rates during meiosis vs asexual reproduction, but this did not change the dynamics even for meiotic transposition rates that were not biologically plausible (up to 10% of TEs transposing during meiosis). The last extension included the introduction of an unlinked, general modifier allele increasing the excision rates of all elements by the same amount. The parameters related to this extension are the initial frequency of the modifier allele and the excision rate increases when the modifier allele is present (see Supplementary file 2B). See the code documentation for details.
Data availability
Raw read data of the experiment are available at SRA (BioProject identifier PRJNA308843).
Code availability
The code used for both the analyses of empirical data and the theoretical prediction of TE dynamics, together with explanations, is available online at https://github.com/KamilSJaron/reproductive_mode_TE_dynamics (copy archived at https://github.com/elifesciences-publications/reproductive_mode_TE_dynamics).
References
Arkhipova and Rodriguez (2013). Genetic and epigenetic changes involving (retro)transposons in animal hybrids and polyploids. Cytogenetic and Genome Research 140:295–311. https://doi.org/10.1159/000352069
Bast et al. (2016). No accumulation of transposable elements in asexual arthropods. Molecular Biology and Evolution 33:697–706. https://doi.org/10.1093/molbev/msv261
Bolger et al. (2014). Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30:2114–2120. https://doi.org/10.1093/bioinformatics/btu170
Burt and Trivers (2006). Genes in Conflict: The Biology of Selfish Genetic Elements (book). Cambridge: Belknap Press.
Charlesworth and Langley (1986). The evolution of self-regulated transposition of transposable elements. Genetics 112:359–383.
Crespi and Schwander (2012). Asexual evolution: do intragenomic parasites maintain sex? Molecular Ecology 21:3893–3895. https://doi.org/10.1111/j.1365-294X.2012.05638.x
Finnegan (1992). Transposable elements. Current Opinion in Genetics & Development 2:861–867. https://doi.org/10.1016/S0959-437X(05)80108-X
Garfinkel et al. (2005). Ty1 copy number dynamics in Saccharomyces. Genetics 169:1845–1857. https://doi.org/10.1534/genetics.104.037317
Hannon Laboratory (2010). FASTX-Toolkit. http://hannonlab.cshl.edu/fastx_toolkit/index.html
Jiang et al. (2017). Insertion polymorphisms of mobile genetic elements in sexual and asexual populations of Daphnia pulex. Genome Biology and Evolution 9:evw302. https://doi.org/10.1093/gbe/evw302
Kofler et al. (2016). PoPoolationTE2: comparative population genomics of transposable elements using Pool-Seq. Molecular Biology and Evolution 33:2759–2764. https://doi.org/10.1093/molbev/msw137
Montgomery et al. (1991). Chromosome rearrangement by ectopic recombination in Drosophila melanogaster: genome structure and evolution. Genetics 129:1085–1098.
Nelson et al. (2017). McClintock: an integrated pipeline for detecting transposable element insertions in whole-genome shotgun sequencing data. G3: Genes, Genomes, Genetics 7:2763–2778. https://doi.org/10.1534/g3.117.043893
Szitenberg et al. (2016). Genetic drift, not life history or RNAi, determine long-term evolution of transposable elements. Genome Biology and Evolution 8:2964–2978. https://doi.org/10.1093/gbe/evw208
Wright and Finnegan (2001). Genome evolution: sex and the transposable element. Current Biology 11:R296–R299. https://doi.org/10.1016/S0960-9822(01)00168-3
Zeyl et al. (1996). Sex and the spread of retrotransposon Ty3 in experimental populations of Saccharomyces cerevisiae. Genetics 143:1567–1577.
Decision letter
Graham Coop, Reviewing Editor; University of California, Davis, United States
Diethard Tautz, Senior Editor; Max-Planck Institute for Evolutionary Biology, Germany
Graham Coop, Reviewer; University of California, Davis, United States
Brian Charlesworth, Reviewer; University of Edinburgh, United Kingdom
In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.
Thank you for submitting your article "Asexual reproduction drives the reduction of transposable element load" for consideration by eLife. Your article has been reviewed by three peer reviewers, including Graham Coop as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by Diethard Tautz as the Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Brian Charlesworth (Reviewer #2).
The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.
The reviewers and I appreciated the analysis and results. I have included the two reviewers' comments below.
Please respond point by point to the reviewer comments below. The major comments that definitely need addressing are:
1) Moderate the causal language concerning selection and increase in excision rate; the evidence for this is indirect at best. See reviewer 1.
2) Reviewer 2 raises a concern about how the non-independence of temporal samples from the same population replicate are dealt with. One suggestion from the reviewer/editor discussion was to just use the difference between the initial and end points. The authors more generally need to be clearer about the analysis performed, e.g. give the formulae for the linear models run.
3) Reviewer 2 raises some questions about full-length and non-reference TEs that need to be addressed.
4) The McDonald experiment is of pooled sequencing, thus you are averaging over population frequencies of TEs at each locus. While, given TE-calling from short-read data and asexuality, it might be very hard to do anything with the frequency of TEs at each locus, I thought it would be helpful to more fully acknowledge that the current analysis (I think) confounds the number of TE loci in the genome and the frequency of the TE at each locus.
Reviewer #2:
This paper presents an interesting analysis of the population dynamics of transposable elements (TEs) in long-term experimental populations of budding yeast, comparing sexual and asexual populations. The data are analysed in the light of simulations, which extend an earlier study by Dolgin and Charlesworth, 2006, to match more closely the properties of the yeast populations. The overall conclusion is that excisions of the LTR retrotransposons (presumably by recombination between the LTRs; this could have been investigated) cause a decline in TE copy number in the asexuals, whereas the sexuals seem to stay more or less in equilibrium.
This is one of the few studies of this type of problem, and illustrates the difficulty of making generalisations about the fates of TEs in populations with different mating systems, as the authors make clear. I have to take their bioinformatic analyses on trust, as I lack the relevant expertise, but they seem to have done a thorough job of these. Overall, this is a nice paper.
My only criticism is that they emphasise the possible role of selection for an increase in excision rate in explaining their results, but I could not see that they have presented any solid evidence that this has actually happened. In an asexual population, all kinds of hitchhiking will be going on, so any increase in frequency of a modifier of excision rate could simply be due to such an effect, although of course one would not expect consistency across replicate populations. They need to make it clearer what the evidence for such selection actually is; it's not obvious to me that there has been an increase in excision rates.
Reviewer #3:
Bast et al. address how sexual reproduction affects transposable element (TE) accumulation in paired sexual and asexual lineages of yeast. As noted in the manuscript, much of the literature surrounding the issue of the impact of sex on TE accumulation is confounded by deeper evolutionary timescales, different effective population sizes, and changes in mating system. Utilizing existing data of yeast experimental evolution lines to get around these issues, Bast et al. find that TEs are spread through sex, and driven towards extinction in asex lines. The manuscript is clearly written and easy to follow. Specific comments below.
My understanding from a brief glance at the McDonald et al., 2016 paper these data come from suggests that the data represent eight separate lineages, such that each point should be connected through the time series in Figures 1 and 2. I'm not familiar enough with the statistics involved, but if each is an independent lineage through time, does this need to be included in analyses? It also seems that some of the uncertainty in TE genotyping could be addressed using replicate lineages – if a full-length copy is 'excised,' we don't expect to see it again in later generations of that replicate.
- Main text, fifth paragraph: Throughout the manuscript, the use of the term 'excision' might confuse readers more familiar with different types of TEs. For DNA transposons (cut and paste), this refers to a complete or near complete removal via transposition. In this manuscript, 'excision' is used for solo LTR formation via unequal recombination. To avoid confusion, perhaps 'solo LTR formation' could be used as an alternative to excision.
- Main text, fifth paragraph: It would be useful to add that solo LTR formation (excision) removes the protein coding genes that allow transposition, to make it clear why we care about full-length copies.
- Main text, sixth paragraph: I am a bit confused on the numbers of full length TEs identified. The second to last sentence states only 24 of 50 full length copies can be detected, but the last sentence states asex decreases to 41 full length copies. Although identifying TEs is really difficult, I am concerned that there could be deletions or excisions in these 26 non-assayable copies that overwhelm the signal observed. Could a total count of transposition-competent copies be tracked through time by something akin to the first coverage based approach (like Figure 1—figure supplement 1), using internal protein coding regions? Or is there a way to explain what's happening with these non-assayable copies? Are they always the same copies in every individual?
Doesn't a constant number of non-reference copies through time (Figure 2B) mean that there is an increased transposition rate in asex through time? To me it feels like since both the asex full length copies (presumably the active copies) and reference copies are being removed through time, this means the per-element transposition rate is going up. Again, this could be addressed by identifying how many non-reference TEs are the same non-reference TE copies (at the same loci) through time (although maybe the low coverage of some samples precludes this?). But it's hard for me to think in residuals, so maybe this isn't impacting things, and the slightly more negative trend in asex in 2B reflects the effect of a higher excision rate on generating fewer new copies.
- Subsection “Modelling”: I need a little more information on the TE annotation. How are full length copies being defined from the RepeatMasker output? Those which contain any internal protein coding sequence? A length cutoff relative to the reference db?
https://doi.org/10.7554/eLife.48548.015
Author response
Please respond point by point to the reviewer comments below. The major comments that definitely need addressing are:
1) Moderate the causal language concerning selection and increase in excision rate, the evidence for this is indirect at best. See reviewer 1.
We changed the language accordingly throughout the manuscript.
2) Reviewer 2 raises a concern about how the non-independence of temporal samples from the same population replicate is dealt with. One suggestion from the reviewer/editor discussion was to just use the difference between the initial and end points. More generally, the authors need to be clearer about the analysis performed, e.g. give the formulae for the linear models run.
The strain replication is taken into account in our model, as we included generation in the permutation ANOVA. We clarified this in the text and the formulae are now given in the manuscript. Additionally, we calculated the statistics for start and end points. For details, please see concern one of reviewer three below.
3) Reviewer 2 raises some questions about full-length and non-reference TEs that need to be addressed.
Detailed replies to these two questions are given at the reviewer’s comments below.
4) The McDonald experiment used pooled sequencing, so you are averaging over the population frequencies of TEs at each locus. Given TE calling from short-read data, and asexuality, it might be very hard to do anything with the frequency of TEs at each locus, but I thought it would be helpful to acknowledge more fully that the current analysis (I think) confounds the number of TE loci in the genome with the frequency of the TE at each locus.
We added clarifications to the manuscript. See also the detailed reply to this comment at the section of reviewer three.
Thank you again for your constructive comments.
Reviewer #2:
[…] My only criticism is that they emphasise the possible role of selection for an increase in excision rate in explaining their results, but I could not see that they have presented any solid evidence that this has actually happened. In an asexual population, all kinds of hitchhiking will be going on, so any increase in frequency of a modifier of excision rate could simply be due to such an effect, although of course one would not expect consistency across replicate populations. They need to make it clearer what the evidence for such selection actually is; it's not obvious to me that there has been an increase in excision rates.
We rephrased the wording accordingly throughout the manuscript to avoid suggesting a strong implication of selection.
Reviewer #3:
[…] My understanding from a brief glance at the McDonald et al., 2016 paper these data come from suggests that the data represent eight separate lineages, such that each point should be connected through the time series in Figures 1 and 2. I'm not familiar enough with the statistics involved, but if each is an independent lineage through time, does this need to be included in analyses? It also seems that some of the uncertainty in TE genotyping could be addressed using replicate lineages – if a full-length copy is 'excised,' we don't expect to see it again in later generations of that replicate.
The strain replication is taken into account in our model, as we included generation in the permutation ANOVA, and there is only one data point per strain per time. We focused on the generation effect, because we are interested in the temporal dynamics of TE loads. We did not separate the strains in the graphics to correspond with the underlying statistics.
We clarified this now by adding the ANOVA formula to the Materials and methods, which was previously only shown in the github repository. See the last paragraph of the subsection “Specific transposable element insertions”.
Additionally, to address the issue, we ran a paired test of the start points (generation 90) and end points (generation 990) per lineage/strain: Sexuals P = 0.84; Asexuals P = 0.03.
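To illustrate what a start-versus-end comparison per lineage involves, the following is a minimal sketch of one possible paired test, an exact sign-flip permutation test on the per-lineage differences. The response does not state which paired test was used, so the choice of test, the function name `paired_signflip_pvalue`, and the example loads are all assumptions made for illustration; the numbers are hypothetical and are not data from the study.

```python
from itertools import product


def paired_signflip_pvalue(start, end):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis of no change, each lineage's difference
    (end - start) is equally likely to be positive or negative, so every
    assignment of signs to the observed differences is enumerated.
    """
    diffs = [e - s for s, e in zip(start, end)]
    observed = abs(sum(diffs))
    as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        stat = abs(sum(sign * d for sign, d in zip(signs, diffs)))
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / total


# Hypothetical TE loads for four lineages (NOT values from the study).
start_load = [50, 49, 51, 50]
end_load = [44, 45, 43, 46]
p_value = paired_signflip_pvalue(start_load, end_load)
print(p_value)
```

Because each lineage contributes a single difference, this design avoids treating repeated time points from the same lineage as independent observations, which is the concern the reviewer raised.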
- Main text, fifth paragraph: Throughout the manuscript, the use of the term 'excision' might confuse readers more familiar with different types of TEs. For DNA transposons (cut and paste), this refers to a complete or near complete removal via transposition. In this manuscript, 'excision' is used for solo LTR formation via unequal recombination. To avoid confusion, perhaps 'solo LTR formation' could be used as an alternative to excision.
Indeed, this might be confusing. We added clarifications to the Introduction (main text, fifth paragraph).
- Main text, fifth paragraph: It would be useful to add that solo LTR formation (excision) removes the protein coding genes that allow transposition, to make it clear why we care about full-length copies.
We added the clarifications (main text, fifth paragraph). Thank you for helping to make the manuscript more clear.
- Main text, sixth paragraph: I am a bit confused on the numbers of full length TEs identified. The second to last sentence states only 24 of 50 full length copies can be detected, but the last sentence states asex decreases to 41 full length copies. Although identifying TEs is really difficult, I am concerned that there could be deletions or excisions in these 26 non-assayable copies that overwhelm the signal observed. Could a total count of transposition-competent copies be tracked through time by something akin to the first coverage based approach (like Figure 1—figure supplement 1), using internal protein coding regions? Or is there a way to explain what's happening with these non-assayable copies? Are they always the same copies in every individual?
We know that the ancestral reference strain has 50 full-length copies. With the insertion detection method (McClintock), we can identify only 24 of them in the starting strains (generation 0, where the count should be about 50). The detected copies are not necessarily the same in the following generations, because with Illumina data it is very hard to identify all insertions and not possible to track them through time. This is why we use the slope of the linear regression to estimate the loss from the original 50 copies, which allows us to compare it with the simulations.
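The slope-based estimate can be sketched as follows. This is a minimal illustration under stated assumptions: it fits an ordinary least-squares slope to detected copy counts over generations and applies that slope directly to the 50-copy baseline. How the slope was actually scaled to the baseline is not described in this response, and the function names and example counts are hypothetical.

```python
def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def projected_copies(initial_copies, slope, generations):
    """Project a copy number forward, assuming a constant per-generation change."""
    return initial_copies + slope * generations


# Hypothetical detected full-length copy counts (NOT data from the study).
generations = [0, 300, 600, 900]
detected = [24, 23, 22, 21]

slope = ols_slope(generations, detected)
# Assumption: the slope of the detected subset is applied unscaled
# to the known baseline of 50 ancestral copies.
estimate = projected_copies(50, slope, 900)
print(slope, estimate)
```

Using the slope rather than the raw detected counts sidesteps the fact that only a subset of copies is detectable at any one time point, provided the detection rate does not itself change over time.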
Additionally, our first approach, which estimates total TE content from the fraction of mappable reads that map to the TE copies (including both internal coding and flanking LTR regions), is independent of the need to detect specific insertions and gives very similar results (slopes; Figure 1—figure supplement 1). This approach does not allow for discriminating specific insertions or truncated TE copies (which can still include internal coding regions). However, it mostly reflects the loss of internal regions: both internal and LTR sequences were included in the TE library, so loss of internal regions should make the bigger difference (solo LTRs are still covered by reads).
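The coverage-based measure amounts to a simple ratio, sketched below under stated assumptions. The sketch assumes that per-sequence counts of mapped reads are already available (for example, from an alignment summary) and that TE library entries can be told apart by name. The function name, sequence names, and counts are hypothetical and do not come from the study's pipeline.

```python
def te_read_fraction(mapped_counts, te_names):
    """Fraction of all mapped reads that map to TE library sequences."""
    total = sum(mapped_counts.values())
    if total == 0:
        raise ValueError("no mapped reads")
    te_reads = sum(n for name, n in mapped_counts.items() if name in te_names)
    return te_reads / total


# Hypothetical mapped-read counts per sequence (NOT data from the study).
mapped_counts = {
    "chrI": 9000,
    "chrII": 9500,
    "TY1_internal": 1000,
    "TY1_LTR": 500,
}
te_names = {"TY1_internal", "TY1_LTR"}

fraction = te_read_fraction(mapped_counts, te_names)
print(fraction)
```

Normalising by the total number of mapped reads makes samples with different sequencing depths comparable, which matters when some time points have low coverage.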
Doesn't a constant number of non-reference copies through time (Figure 2B) mean that there is an increased transposition rate in asex through time? To me it feels like since both the asex full length copies (presumably the active copies) and reference copies are being removed through time, this means the per-element transposition rate is going up. Again, this could be addressed by identifying how many non-reference TEs are the same non-reference TE copies (at the same loci) through time (although maybe the low coverage of some samples precludes this?). But it's hard for me to think in residuals, so maybe this isn't impacting things, and the slightly more negative trend in asex in 2b reflects the effect of a higher excision rate on generating fewer new copies.
These are very interesting questions, but unfortunately, with the data available in the study we cannot disentangle whether there is faster TE turnover in each sexual genomic background or whether the same TE insertions are maintained through time. This is because pooled sequencing lowers the probability of TE detection in specific individuals, such that we cannot analyse turnover or activity of specific elements in specific genomic backgrounds. What we analysed is the presence of TE insertions at a given generation, meaning they are present in enough genotypes in the population to be picked up in relatively low-coverage Illumina sequencing. We clarified this in the Results (main text, sixth paragraph).
- Subsection “Modelling”: I need a little more information on the TE annotation. How are full length copies being defined from the RepeatMasker output? Those which contain any internal protein coding sequence? A length cutoff relative to the reference db?
We identified full-length LTR elements based on the TE insertion length information from the McClintock output, which was filtered by a custom script that collects all information from the various McClintock output files. These elements include both internal coding sequences and flanking LTR regions. See subsection “Specific transposable element insertions”, last paragraph.
https://doi.org/10.7554/eLife.48548.016
Article and author information
Author details
Funding
Deutsche Forschungsgemeinschaft (BA 5800/1-1)
- Jens Bast
Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (PP00P3_170627)
- Tanja Schwander
Schweizerischer Nationalfonds zur Förderung der Wissenschaftlichen Forschung (PP00P3_139013)
- Tanja Schwander
Deutsche Forschungsgemeinschaft (BA 5800/2-1)
- Jens Bast
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
Acknowledgements
We thank Michael J McDonald, Daniel P Rice and Michael M Desai for providing the experimental evolution raw data and for helpful explanations. We further thank Patrick Tran Van for setting up the insertion pipeline, Daniel L Jeffries for providing the TE wrapper script, Beatriz Navarro Dominguez for improving the empirical analyses R script and Deborah Charlesworth, Brian Charlesworth, Graham Coop and Laurent Keller for discussions and comments on the manuscript. This study was supported by DFG research fellowships (grant numbers BA 5800/1-1 and BA 5800/2-1 to JB) and by funding from the University of Lausanne and Swiss SNF (grant numbers PP00P3_170627 and PP00P3_139013 to TS).
Senior Editor
- Diethard Tautz, Max-Planck Institute for Evolutionary Biology, Germany
Reviewing Editor
- Graham Coop, University of California, Davis, United States
Reviewers
- Graham Coop, University of California, Davis, United States
- Brian Charlesworth, University of Edinburgh, United Kingdom
Publication history
- Received: May 17, 2019
- Accepted: September 4, 2019
- Accepted Manuscript published: September 5, 2019 (version 1)
- Version of Record published: October 8, 2019 (version 2)
Copyright
© 2019, Bast et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.