Introduction

Modifications of DNA and histones represent epigenetic marks associated with active or repressed gene expression and play an important role in plant development and responsiveness to the environment. Unlike in animals, plant germlines are not set aside early in development but form later when somatic cells get committed to form gametes (1). As a result, epigenetic marks affected by environmental conditions can potentially be inherited. Such marks may thus contribute to adaptive responses in changing environments. Furthermore, spontaneous changes in DNA-methylation, a mark associated with some naturally occurring heritable epialleles in the plant model Arabidopsis thaliana (e.g., 2, 3, 4), occur much more frequently than genetic mutations (5, 6, 7). This suggests that the rates of epimutations might be sufficiently high to differentiate epigenetic from genetic variation, yet low enough to maintain a heritable response to selection.

The potential role of epigenetic variation in adaptation depends on several factors: the rate of spontaneous epimutations, the influence of environmental effects on epigenetic variation, and the impact of epigenetic variation on the phenotype. Additionally, for epigenetic variation to contribute to adaptation, it must be sufficiently stable, i.e., heritable over several generations such that it can be selected upon.

In non-model species, DNA methylation patterns assessed with reduced representation sequencing methods were frequently found to be associated with distinct environments or treatments. However, epigenetic variation was generally correlated to genetic variation (8, 9, 10, 11, 12). Likewise, high-resolution genome-wide studies in Arabidopsis have revealed extensive variation in DNA methylation patterns among different populations but was mostly associated with underlying genetic differences (13, 14). Nonetheless, studies on epigenetic recombinant inbred lines (epiRILs) suggest that epialleles can significantly contribute to phenotypic variation in plants (15, 16). Thus, while there are examples of pure epialleles (17) whose frequency is not related to genetic variation, the role of spontaneous epigenetic variation in adaptation remains controversial (18, 19, 20, 6). This is largely because in most species it is difficult to distinguish epigenetic variation that is a consequence of genetic differences from pure epigenetic variation that is independent of the genotype (21). Disentangling genetic and epigenetic effects bears most promise in species that are clonally reproducing, such as dandelions or strawberries (22, 23), or that are highly inbred, such as Arabidopsis.

In a previous experiment with Arabidopsis (4), we simulated selection in a rapidly changing environment. We compared phenotypic traits and epigenetic variation between populations grown for five generations under selection with their genetically identical ancestors in a common environment. Selected populations of two distinct genotypes exhibited significantly different flowering times and plant architectures when compared with their ancestral populations, and these differences persisted for at least three generations in a common environment. In parallel, we observed a reduction in epigenetic diversity and changes in DNA methylation patterns, some of which were associated with the observed phenotypic changes. These findings indicated that epigenetic variation was subject to selection. The genotypes used in that experiment were recombinant inbred lines (RILs), formed by hybridizing two different accessions followed by several generations of inbreeding. Therefore, most of the epigenetic variation in these RILs was likely retained as standing variation from this original hybridization event (24) and not accumulated from spontaneously occurring or environmentally induced epimutations during the selection experiment (4).

Here, we used a similar approach using Arabidopsis accessions from a previous selection experiment that investigated the effects of aphid herbivores on plant defensive traits to assess the potential for epigenetic differentiation between selection treatments (25). In the selection experiment, mixtures of genetically uniform Arabidopsis accessions were grown for five generations in either insect free (control) or inoculated cages with different single aphid species or their mixture. After five generations, genetic variation was significantly reduced, and populations were dominated by three accessions to various degrees (25). To examine whether epigenetic variation was also selected within these three accessions (Ma-0, Wei-0, Wil-3), we compared phenotypic traits of selected populations to (i) the original founder population, (ii) the control population, and (iii) between two single-species-aphid treatments. From each selection treatment, we used 6 independent replicate populations after five generations of selection and grew them together for two generations in a randomized planting pattern in a common environment. To minimize confounding maternal effects, we assessed the phenotypes only in the second generation. We further compared genome-wide DNA methylation states in a subset of mother plants used to set up the common environment experiment.

We designed the experiment to detect effect sizes of 10% or more at the significance level of P < 0.05 with a statistical power beta > 0.80. While we found highly significant differences between accessions, replicate populations, and across five experimental blocks (with plants generally growing worse in two blocks compared to the other three), we observed only few significant differences between selection regimes (control and the two different aphid treatments). These differences were weak, with the percentage sum of squares explained by the contrasts being less than 1% and P values ranging between 0.05 and 0.01. We found minimal differences in DNA methylation patterns within accessions between plants from different selection regimes, with 0.009% of all tested cytosines showing significant differences after correction for multiple testing (FDR < 0.05). In contrast to our previous study (4), there was no reduction in the amount of epigenetic variation in response to selection. In summary, we found evidence that some within-accession differences caused by the selection regimes persisted for two generations in the common environment. However, the extent of these differences was low. Thus, we conclude that evidence for selection from epigenetic variation in this experiment was weak.

In strong contrast to our findings, a recent study claimed that environment-induced heritable variation is common in Arabidopsis (26). We were surprised about this finding because, in our experiment, the aphid treatments had a strong impact on plant fitness and survival but not on epigenetic differentiation. Thus, if environment-induced heritable variation were indeed so common in this model species, we would have expected to observe it, too. We noticed that the design of the study by Lin and colleagues (26) was elegant but susceptible to potential errors that confounded treatments with genetic variation at the beginning of the experiment. We used their RNA-seq data to assess whether the analyzed plants displayed any evidence of residual genetic variation that was potentially confounded with the treatments. Our analysis showed that about 10% of the plants belonged to accessions different from the ones they had been assigned to. This error did not occur at random but stemmed from a mix-up of two selection treatments. Thus, the large effects assigned to environment-induced heritable variation were very likely due to genetic differences between accessions.

In summary, while epialleles can be subject to selection and contribute to adaptation (4), it remains unclear how common this is, and epigenetic differentiation in response to selection seems to require standing epigenetic variation within accessions. In contrast, evidence for a role of environment-induced epigenetic changes in adaptation in the model plant Arabidopsis remains weak.

Results

Impact of Selection Treatments on Phenotypic Traits

As described above, we had grown mixtures of genetically uniform Arabidopsis accessions for five generations under control conditions or inoculated with different aphid species. After five generations, the populations were dominated by the three accessions Ma-0, Wei-0, and Wil-3. To investigate whether the selection treatments led to phenotypic differences also within these uniform genetic backgrounds, indicative of selecting epigenetic variation, we tested whether the selection treatments had an impact on the time to bolting, the rosette diameter at bolting, the number of rosette leaves at bolting, the leaf blade area, and the trichome density (Tables 1-2, Figures 1-3). Besides the selection treatment and its contrasts, the model also included the experimental blocks (n = 5), the accessions (n = 3), the populations (n = 18+3 ancestrals), and maternal lines (“mother”, N = 261), along with interactions of the selection treatment contrasts with accessions. Additionally, we conducted analyses without the Wil-3 accession, which had low replication at the level of maternal lines and ran separate analyses for each accession. Additional models and power estimations can be found in the Supplementary Information (Supplementary data S1 and Supplementary data S2).

Schematic of the original experimental design (top) and the current study (bottom).

Percent sum of squares explained by model terms and P-values using all three accessions.

Contrasts of a term are marked with indents. Complete output and further results for individual accessions are given in the supplemental data.

Percent sum of squares explained by model terms and P-values using only Ma-0 and Wei-0 accessions.

Contrasts of a term are marked with indents. Complete output and further results for individual accessions are given in the supplemental data.

The term “block”, especially the comparison between blocks 1–3 and blocks 4–5, was highly significant in all analyses and generally explained most of the total phenotypic variation (up to nearly 50% for leaf blade area, Table 2). The time to bolting was an exception in this respect because here, the term “accession” explained most of the variation (23.23%), followed by “population” (8.82%), “mother” (8.11%), and “block” (6.55%; the contrast comparing blocks 1–3 with blocks 4–5 only explaining 0.51% of the variation in time to bolting). Besides the term “block”, the terms “accession” and “population” were consistently significant and explained a large fraction of the variation. Additionally, the term “mother” was occasionally significant, accounting for 5 to 10% of the total phenotypic variance (Tables 12). In contrast, the selection treatment and its contrasts were rarely significant. If significant, the term “selection treatment” explained less than 1% of the total phenotypic variation. When analyzing all data, rosette diameter at bolting was significantly different between the control and aphid treatments, and between the two aphid treatments (Table 1). These differences were driven by the accession Wil-3 with low replication at the level of maternal lines (Table 3, Figures 23). If we excluded Wil-3, only the interaction between the aphid treatment contrast and the remaining two accessions remained significant for the phenotype rosette diameter (Table 2). Separate analyses for each accession revealed significant differences in bolting time between the control and the aphid treatments in Ma-0 (Figures 23). We also observed a significant difference in trichome densities between the two aphid treatments in Wei-0; however, these were driven by some extreme values (Figure 3). We found several significant differences between selection treatments in Wil-3 (Figures 23), but, again, considering the low level of replication at the level of maternal lines in this accession, these differences should be interpreted with caution.

Phenotype measures in the G7 generation averaged by the ID of the maternal lines from the G6 generation.

P-values for contrasts with P < 0.05 are indicated (based on the analyses done per accession). P-values with asterisk were not significant (P < 0.05) when outliers were removed (see Figure 3).

Phenotypes measured in the G7 generation.

Brown and green colors represent plants from blocks 1–3 and 4–5, respectively (slower growth in blocks 4– 5). Dots with black strokes were identified as outliers. The number of plants is the minimal number of plants. It varies by 2–3 plants per accession and treatment as some measures were missing. P-values for contrasts with P < 0.05 are indicated (based on the analyses done per accession). P-values with asterisk were not significant (P < 0.05) when outliers were removed.

Gene Ontology enrichment among genes mapped by DMCs.

“Found” indicates the number of different genes with the term that were mapped by DMCs. “Expected” refers to the number of genes that were expected to be mapped by DMCs if DMCs were randomly distributed.

Limited Impact of Aphid Treatments on DNA Methylation Patterns and Epigenetic Variation

To assess whether epigenetic variation could be responsible for the phenotypic differences observed between ancestral and selection treatments, we characterized changes in the genome-wide methylomes of a subset of plants, taking DNA methylation as a proxy for epigenetic variation. This was done with a subset of Wei-0 maternal lines from generation G6. Thus, different from the phenotypic results obtained from plants in generation G7, these DNA methylation data are a more direct readout of the aphid treatment (less random changes because one generation earlier, but potentially including stronger maternal effects). We compared DNA methylation patterns of ancestral and selection-treated plants (n = 34) by whole-genome bisulfite sequencing. In total, we tested 3,573,936 cytosines for differential methylation between any of the three groups or between ancestral and the two aphid-selection treatments. The number of significantly differentially methylated cytosines was similar in all comparisons: from 101 when the two aphid treatments were compared with each other to 131 when comparing the L. erysimi treatment with the ancestral treatment. Overall, about 0.009% of all tested cytosines were significantly different and they were dispersed across the entire genome (Figure 4A). If selection had acted on specific DNA methylation epigenotypes, one would expect a reduction in the variation of DNA methylation (c.f. 4). Thus, we also tested whether the extent of epigenetic variation changed during the selection experiment. We did not find any evidence of a reduction or expansion of variation in DNA methylation in any context (Figure 4B).

a) Location of the differentially methylated cytosines along the genome. b) Epigenetic variation within different lines (selection replicates), i.e., distances to each centroid. Only lines with at least four individuals are shown. No overall and no pairwise P-value was smaller than 0.05.

To understand the potential effects of differential cytosine methylation, we combined all differentially methylated cytosines (DMCs) from the three comparisons (n = 308) and characterized them with respect to sequence context and position relative to genes. Among the 308 DMCs, the majority was in the CG context (87.6%) and only few were found in the CHG and CHH contexts (6.2% each). Considering all cytosines in the Arabidopsis genome, about 13%, 15%, and 72% are found in the CG, CHG, and CHH contexts, respectively (4). Thus, there was a clear enrichment of DMCs in the CG context, similar to what we found in the previous selection experiment (4) and consistent with the fact that methylation in the CG context shows the highest variability (27). Almost all DMCs were located within or less than 2 kb away from genes (87.3%) or genes and transposable elements (98.1%). We tested whether genes with certain functions were enriched for DMCs using gene ontology (GO) enrichment analysis (Table 3). There was an enrichment of several terms related to epigenetic regulation: “histone methylation”, “DNA methylation”, “chromatin silencing”, and “RNA interference”. Likewise, several terms were associated with post-translational protein modifications: “post-translational protein modification”, “protein modification by small protein removal” and “protein glycosylation” (Table 3). The remaining terms could be associated with transport and growth in general, for example “positive regulation of cell proliferation”, “secondary cell wall biogenesis”, and “gravitropism”. In conclusion, although we identified only a few DMCs, they were preferentially found in genes associated with epigenetic regulation, post-translational protein modifications, and transport and growth processes. However, the overall limited differentiation and the absence of reduced variation in DNA methylation suggest that there was only weak selection of specific DNA methylation sites.

The Large Effects Observed in a Recent Study Are not Due to Environment-Induced Heritable Variation but to Genetic Differences between Accessions

In contrast to our study, which found only weak evidence for the selection of epigenetic variation, a recent report claimed that environment-induced heritable variation is common in Arabidopsis (26). We wanted to investigate this surprising difference in the outcomes the two experiments, which both aimed at investigating the potential role of epigenetic variation in adaptation in this model species. The aphid treatments we used had a strong impact on fitness (25), such that we would have expected to find correspondingly strong evidence for the heritability of environment-induced traits if this were indeed as common as described by Lin and colleagues (26).

We realized that in this recent study (26), potential errors may have confounded treatments with genetic variation. This is because in that study, Lin and colleagues kept lineages 1-to-1 throughout the experiment by single-seed descent. Since the authors provided RNA-seq raw data of 190 lineages (from a total of 4,032), we could use these data to assess genetic variation among them. We tested whether the subset of 190 plants displayed any evidence of residual genetic variation confounding the treatments within the different accessions used (26). Their RNA-seq data supposedly included four accessions. We compared pairwise genetic distances and identified the accessions most closely related to the experimental plants. We used 10,000 randomly drawn genes for the analyses and were left with up to 13,418 variants after filtering for coverage between 10–1,000 in at least four plants per treatment and accession. Visualizing pairwise genetic distances between all plants revealed that two presumed accessions, Abd-0 and TRE-1, were each represented by two genetically distinct clusters (Figure 5A). When we visualized each accession separately, highlighting the treatments, we found that the genetically distinct groups had been associated with different selection treatments. Specifically, the high Cd treatment in presumed Abd-0 and the drought treatment in presumed TRE-1 were likely other genotypes (Figure 5B). When we compared the genomes of these experimental plants to those of publicly available accessions, we could match the genetically distinct plants from the high Cd treatment in Abd-0 to the Ws-2 accession. Likewise, the genetically distinct plants from the drought treatment in TRE-1 matched the Fei-0 accession (Supplementary Data S3). Notably, when we assigned the experimental plants to publicly available accessions, the distances to the closest accessions were between 0.0003 and 0.0034 (0.0019 on average). In strong contrast, when the genetically distinct plants that matched to Ws-2 or Fei-0 were compared to the accession they were originally thought to belong to, the distances ranged from 0.358 to 0.451 (0.414 on average). Hence, misassignments due to random errors were highly unlikely.

a) Genetic distances between the 190 RNA-seq samples from Lin and colleagues (26). The two genetic clusters diverging from the rest of their accession group are highlighted. b) Within accession-label genetic distances, drawn to scale. Maximal distance given below the label, colored according to treatment and generation. The two treatments that have been assigned the wrong accessions are highlighted. c) Pairwise genetic distances within lines (plants of the same lineage, i.e., single-seed descendants), within accessions, and between accessions using either the original accession labels from Lin and colleagues (left, 26) or the accessions assigned due to genetic similarity to the data available at 1001genomes.org (right). The red line demarcates the distance equal to the average of within-line distances plus three standard deviations.

To confirm our finding of a potential mix-up, we used either the original or the newly assigned accession labels to split the pairwise distances between experimental plants into three classes: (i) within lineage (i.e., (grand)-parent-to-offspring), (ii) within accessions, and (iii) between accessions. The former two should both yield values close to zero as plants within lineages and within accessions should be genetically (nearly) identical to each other. In contrast, distances between accessions can be substantial (Figure 5C). This expectation was not met with the original accession labels, for which we found large distances in all three classes. In contrast, the same evaluation with the re-assigned accession labels Ws-2 (for Abd-0 in High Cd treatment) and Fei-0 (for TRE-1 in drought treatment) confirmed our expectation. Thus, two treatments and accessions were mixed up with other accessions that were also part of the experiment.

In summary, our analysis reassigned 20 out of 190 plants (∼ 10.5%) to new accessions. This error did not occur at random but stemmed from the mix-up of two accessions with two treatments. This explains the large effects assigned to environment-induced heritable variation by Lin and colleagues (26) that, however, are in fact due to genetic differences.

Discussion

In this study, we investigated whether selection by different aphid species within genetically uniform Arabidopsis accessions can lead to heritable phenotypic variation that is stable in the absence of selection for at least two generations, indicative of adaptation. Such (meta)stable phenotypic variation is expected to be predominantly epigenetic in nature, defined as a mitotically and/or meiotically heritable change in gene expression without a change in DNA sequence (28). This is because Arabidopsis is highly inbreeding, such that genetic variation within an accession is expected to be extremely rare, while epigenetic differences, e.g., in DNA methylation, are much more common (29, 30). We investigated epigenetic variation at multiple levels: differences between treatments would indicate consistent selection of certain phenotypes, while differences between replicate populations could arise from either standing (epi)genetic variation at the beginning of the selection experiment or from de novo gained (epi)genetic variation developed during the experiment.

In summary, we found only weak evidence for epigenetic differentiation within accessions due to selection treatments, expected to manifest as significant phenotypic differences in the second generation in a common environment or as clear differences in the methylomes analyzed in our experiment. Except for one case, differences were limited to one accession with low replication at the level of mother plants or depended on the presence of a few extreme measurements. While differences between experimental blocks often explained the largest amount of the observed variation in phenotypes, this was different in the case of time to bolting. This is in line with the notion that flowering time in Arabidopsis has a strong genetic component (31). Similarly, we consistently observed strong differences between the accessions, again emphasizing the genetic component of trait variability in this model species.

The significant phenotypic differences in the common environment among populations within accessions and selection treatments and among maternal lines within populations were remarkable, because these should reflect epigenetic (or maternal effect) differences rather than genetic differences if accessions were genetically uniform (33). Especially the differences among populations within accessions and selection treatments, maintained for two generations in the common environment, were highly significant, even in comparison with the variation between accessions (Table 1 and 2). There are two possible explanations for these population effects: (i) initial genetic or epigenetic differences between populations that resulted in founder effects and/or (ii) differential (but selection treatment-independent) gain of genetic or epigenetic variation between replicate populations during the five generations of the selection experiment.

Initial genetic (or epigenetic) differences between populations may have two reasons: (i) seeds of single accessions obtained from NASC were not derived from homozygous, genetically uniform plants, or (ii) genotyping errors, occurring during the identification of plants representing the accessions of interest, similar to what we found for the study by Lin and colleagues (26). In that study, two presumed accessions, Abd-0 and TRE-1, showed strong evidence of environment-induced heritable variation while the two others, Ang-0 and Tol-0, did not. Plants of the two accessions showing no effect were genetically very similar to each other and were found identical to their accessions using publicly available data on over 1,000 accessions (1001genomes.org). However, plants from the presumed accessions with strong effects, Abd-0 and TRE-1, showed strong genetic differences between treatments, which was found to be due to a mix-up of these accessions while preparing or conducting the experiment. Both cases could explain differences in initial genetic composition between replicate populations in our study.

Differential gain of heritable variation between populations may include both, genetic and epigenetic variation. Potential bottlenecks during the selection experiment may increase the chances for differential drift, resulting in (epi)genetically distinct plants within each accession, population within accession and selection treatment, and maternal line within population. Even though epigenetic mutation rates are much higher than genetic mutation rates (5), genetic mutations may also play some role. Some genetic changes, such as changes in repeat length, occur much more frequently than nucleotide substitutions, of the latter occurring, in average, only one per genome and generation (34). Finally, it is conceivable that rare crossing events between accessions may also have differentially biased the genetic composition of populations.

While experimental errors can only be addressed by verifying the material’s identity at each step (which can be costly and challenging), residual genetic variation and newly acquired (epi)genetic variation could be experimentally distinguished in certain setups. If plants are split up into populations that are maintained throughout the experiment, it is important to grow and phenotype them in a common environment both at the beginning and the end of the experiment. If phenotypic differences between replicate populations already occur before imposing the selection treatments, it would suggest that the populations are (epi)genetically distinct from each other. In contrast, differences between replicate populations after treatment that are maintained in a common environment may indicate that variation was gained during the treatment. We note that differences between replicate populations due to founder effects or drift may be as relevant regarding evolutionary processes as consistent differences between selection treatments (35). While the latter indicate that there was selection and that it was directional, the former suggests that the formation of population replicates for the selection treatments increased phenotypic variation that has not yet been selected.

We found weaker evidence for selection of epigenetic variation in this Arabidopsis experiment than we did in our previous study, in which we experimentally simulated selection in a rapidly changing environment (4). The reason for this difference may be that, in the previous study, we used recombinant inbred lines (RILs) that were generated by crossing two different accessions followed by several generations of inbreeding. We hypothesized that the epialleles differing between selected and ancestral plants persisted for multiple generations in a state that did not match the underlying genotype, but finally converged to it. This was impossible in the present study because we used distinct accessions. In our previous study (4), we also noted some residual heterozygosity in one of the RILs, which might have contributed to our observations. Studies on a wide range of plant species that included genetically non-uniform populations consistently reported a significant correlation between genetic and epigenetic variation (8, 9, 10, 11, 12). To the best of our knowledge, there are very few studies in natural systems that demonstrated genetically independent, heritable — that is true epigenetic — variation in the absence of environmental triggers or after accounting for maternal effects. For example, maternal effects may explain the differences in flowering time between genetically identical apomictic dandelions collected in the field (36). One example for inheritance of true epigenetic variation has recently been conducted on clonally reproducing Lemna minor (37). The authors report an epigenetic memory effect of heat treatments with changes in DNA methylation in the CHG context. However, this might be due to differences in mitotic (i.e., clonal) vs. meiotic inheritance. While it has long been known that certain states are faithfully inherited during mitosis (38, 39), recent evidence shows that meiosis is accompanied by epigenetic reprogramming in plants (40, 41, 42, 43, 44, 45). Thus, it may be interesting to study epigenetic inheritance in species that generally reproduce clonally through mitosis but distinguish between vegetative reproduction through multicellular meristems and apomictic reproduction through an unreduced, unrecombined egg cell (1, 46, 47).

Methods

Experimental background and material from multigenerational selection experiment

The selection experiment from which the plants used in this study were derived, was fully described by Züst and colleagues (25). In brief, the experiment began with a collection of 27 Arabidopsis accessions. Seeds obtained from NASC were expected to represent genetically uniform accessions and were propagated for one generation. For the selection treatment, experimental populations with 20 seeds from each accession were planted in cages. 17 days after sowing, different aphid species or a mixture were released into the cages (none as a control). After 60 days, all plants in a cage were harvested and a population of 800 seeds were sowed representing the next generation. Each population was maintained over five generations of selection, with six replicate populations per selection treatment. At the end of the selection experiment, the genetic composition of the populations was assessed (25).

Experimental design

For the study in a common environment, we focused on three well-surviving accessions from the selection experiment: Ma-0, Wei-0, and Wil-3. The accessions Ma-0 and Wei-0 were chosen because they were the most frequent after the aphid selection treatments. Wil-3 was chosen because it is a trichome-less accession that persisted across all treatments. We further focused on three selection treatments: the two single-aphid treatments with Bravicoryne brassicae (BB) and Liphaphis erysimi (LE), and the control without aphid treatment. Additionally, we included offspring from the plants that were used to set up the original selection experiment, here referred to as “ancestrals”. All selected plants were grown together for one generation to minimize confounding maternal effects. For each treatment and accession, we planned to use five mother plants from each of the six populations per selection treatment. However, due to the rarity of some accessions in certain populations, we supplemented with mothers from other populations of the same selection treatment to fill up missing plants (360 maternal seed families were planned but ultimately only 261 were used). We accounted for this in our analysis by using population identity (n = 6 replicates x 3 selection treatments + ancestrals = 19) and maternal plant identity (n = 261 < 5 mothers x 3 accessions x 18 populations for control and selection-treatment plants + 30 mothers x 3 accessions for ancestrals = 360) as random terms. The ancestrals did not stem from multiple populations, but we planned to use an equal amount of mother plants per accession (31). After the first generation of the present experiment (the sixth generation (G6) in reference to the selection experiment), between 5 (planned) and 150 (if there was only a single mother available) offspring of each mother were grown and phenotyped in the second (G7) generation (n = 1,800). Both generations were grown in a common environment in a fully randomized block design. Hence, if there were differences between selection treatments, they would have been maintained for two generations in the absence of selection by aphids.

G6: genotyping of offspring from the selection experiment

In the selection experiment, seeds were collected from all plants per cage (= population) at once. Thus, the accessions were mixed within populations. To identify the desired accessions, 2,000 seeds were sterilized and sown on Murashige and Skoog agar plates and stratified for three days at 4 °C. Plates were transferred to an incubator with a day/night cycle of 18h/6h at a constant temperature of 22°C. Seven days after germination, we harvested leaves from seedlings resembling the Ma-0, Wei-0, or Wil-3 accession. Seedlings were genotyped as described previously (25). Genotyped seedlings were transplanted onto soil 14 days after germination and were grown in a controlled climate chamber at 18° C under a day/night cycle of 16h/8h. Seeds of plants from this G6 generation were collected from each plant separately to start the G7 generation with one seed per plant (n = 1,800, 1,759 surviving).

G7: genotyping and phenotyping

The plants of the G7 generation were arranged in a randomized layout on agar plates and later trays (24 plants per tray). We changed from square plates to circular plates after the first three blocks. This caused a change in thickness of the MS agar on the plates and subsequently slower growth in blocks 4 and 5 than in blocks 1 to 3. We accounted for these differences during statistical analyses by using a corresponding contrast term within the term blocks. Plants were transplanted onto soil 14 days after germination and grown in a controlled climate chamber at 18° C under a day/night cycle of 16h/8h. Plants were grown in five randomized blocks of 360 plants with offspring of each mother present in each block. Plants were not switched between blocks when transferred from agar to soil. The remaining seedling DNA, which was not used for genotyping, was stored at –20 °C for later sequencing experiments.

Several traits were measured during the G7 generation (cf. 4). The days to bolting was measured as an indication for flowering time. The number of rosette leaves and the rosette diameter (longest distance) were both measured at the day of bolting. 21 days after germination, the fifth true leaf was removed and stuck onto a foil for trichome number counts and leaf blade area measurements. To count trichomes, leaves were imaged on a Leica DMR bright-field microscope. Leaf area was measured by scanning the foil with an office scanner and extracting the leaf area with the ImageJ plugin LeafJ (48).

Statistical analyses

Statistical models

All analyses were carried out in R Version 3.6.1. Raw data are available (Supplementary data S4). The design comprised the following factorial terms that were not of interest per se: accession (ET, 3 levels), block (BL, 5 levels with a fixed-effect contrast EX comparing blocks 1–3 with blocks 4–5), tray (TR, 75 levels), population identity (LI, 18 levels + 1 ancestral per accession), mother plant identity (MO, < 120 levels per accession, 261 in total). The factorial term of interest was the selection treatment (Selection, 4 levels), which we split into three orthogonal one-degree of freedom contrasts to compare (i) ancestral plants with plants from the selection experiment (AN), (ii) the control treatment with the two selection treatments (CO), and (iii) the two selection treatments with each other (SE). We used both, the regular lm() function with specifying the appropriate statistical tests manually and lmer() that handles linear mixed models with random terms (49). With the regular lm(), we used the following two formulae for the explanatory terms:

  • (EX+BL)+TR+(AN+CO+SE+LI)+ET+(AN+CO+SE+LI):ET+MO (m1)

  • (EX+BL)+TR+(AN+CO+SE+LI)+ET+EX:(AN+CO+SE+LI)+EX:ET+LI:ET+MO (m2)

The term selection and its contrasts were tested against the term population identity (LI). Interactions of selection contrasts with accession (e.g., SE:ET) or with the block contrast (e.g., SE:EX) were tested against the interaction between population identity and accession (LI:ET). The accession term itself (ET) was tested against the term mother plant identity within accessions (MO). Both models were also fitted with lmer() using the following formulae:

  • (AN+CO+SE)+ET+(AN+CO+SE):ET+(1|BL)+(1|TR)+(1|MO)+(1|LI/ET) (m1)

  • EX+(AN+CO+SE)+ET+EX:(AN+CO+SE)+ET:EX+(1|BL)+(1|TR)+(1|MO)+(1|LI/E T) (m2)

The two approaches yielded practically identical results and we refer to the results obtained with the classical lm() function in the main text. Results from the lmer() tests are given in Supplementary data S1.

We ran the models for all phenotypic traits separately. We transformed all traits using the logarithm and the square root and tested whether the transformations improved the normality assumption for the residuals using a Shapiro-Wilk test (50). In general, transformations did not improve the normality assumption except for trichome density, which was then transformed using the logarithm. We further assessed the effects of outliers by removing potential outliers with the ROUT method using the accession and the selection treatment for grouping (51). Wil-3 had only one LE and three BB mothers, meaning that the two selection treatments were poorly replicated for this accession. As Wil-3 lacks trichomes, it was excluded from the trichome density analysis. We also ran all the models without Wil-3. We report all results to demonstrate that we tested exhaustively.

Power estimations

To estimate the statistical power to detect selection effects within genotypes (i.e., accessions), we employed the same models as described above and simulated data using the same data structure as in the real data. We simulated five different settings with effect sizes ranging from 1 to 20%: (i) differences between all four groups with step sizes of 1, 5, 10, 15, and 20% between each group (with the level following AN<CO<BB<LE, note that the largest difference here is 60%); (ii) only different in ancestors; (iii) only different in controls; (iv) different in both selection treatments; (v) different in only one selection treatment. For each setting, we generated 1,000 data sets and employed the models as described above. We then summarized the percentage of significant outcomes for the different settings (Supplementary data S2). In brief, we could detect effect sizes of 10% in more than 80% of the simulated data sets. An exception was trichome density for which the effect size had to be 20% for a robust detection. Note that an effect size of 20% is generally considered small (52).

DNA methylation

Library preparation

We selected 7, 12, and 13 mother plants from the ancestral, BB, and LE treatments (Wei-0 accession). Libraries for whole-genome bisulfite sequencing (WGBS) were prepared as previously described (4), using the DNA extracted for genotyping of the plants in the G6 generation and sequenced on an Illumina HiSeq 2500 (250 bp paired-end). In total, we obtained 293,022,365 read-pairs. On average, samples had about 8.6 mio reads. The lowest were 3.8, 6.4, and 6.5 mio, the highest were 11.7, 12.8, and 13.8 mio reads. Mapping efficiencies were generally similar (between 37.8 and 48.2%) but two samples had clearly lower mapping efficiencies (6.0 and 13.4%). These two samples were removed from the analysis (A18 and B26).

Data processing

Short reads were (quality) trimmed with fastp (version 0.20.0 with the options -- trim_front1 6 --trim_tail1 6 --trim_front2 6 --trim_tail2 6; 53) and aligned to the reference genome with Bowtie2 (version 2.3.5.1, 54) in combination with Bismark (version 0.22.3 with the option --dovetail, 55). The reference genome for the Wei-0 accession was obtained from 1001genomes.org (GMI-MPI release v3.1, pseudogenomes, pseudo6979.fasta.gz) and plastid sequences were added from the Col-0 reference genome (TAIR10). Alignments were deduplicated with Bismark and methylation tables were extracted with MethylDackel (version 0.5.0, github.com/dpryan79/MethylDackel). Files were merged and only cytosines with a coverage between 5 and 100 within at least two samples per group were kept (434,671, 479,201, and 2,669,064 in the CG, CHG, and CHH context, respectively; in total about 8.5% of all available cytosines). The three experimental groups (ancestrals and B. brassicae or L. erysimi aphid treatments) were compared to each other with the R-package DSS (version 2.32.0, 56) using the functions DMLfit.multiFactor(), DMLtest.multiFactor(), and callDMR(). The P-value threshold for the DMR calling was set to 0.01. Finally, all P-values were corrected for multiple testing and thereby converted to false discovery rates (FDRs, 57).

To test whether the extent of epigenetic variation changed during the experiment, we compared variation within original replicate populations with at least 4 samples with the function betadisper() from the R-package vegan (version 2.5.7, 9, 58). The function was also used to extract the distances to the centroids. From the 32 samples, 28 were within original replicate populations with at least 4 samples: 7 ancestrals, 12 from the L. erysimi treatment (3 original replicate populations), and 9 from the B. brassicae treatment (2 original replicate populations). Epigenetic distances between samples were calculated as the average difference in percent methylation across all cytosines (8) that passed the following filter: coverage between 5 and 100 in at least 5 individuals per treatment (n = 901,843).

To functionally characterize the genes associated with differentially methylated cytosines, we tested for enrichment of gene ontology (GO) terms using topGO 2.20 (59) in conjunction with the GO annotation available through biomaRt (60). Analysis was based on gene counts (genes with DMCs within 2kb compared to all annotated genes) using the “weight” algorithm with Fisher’s exact test (both implemented in topGO). A term was identified as significant within a given parameter combination if the P-value was below 0.01.

Analysis of genetic variation in the subset 190 plants studied by Lin and colleagues (26)

To assess whether there was any genetic variation associated with the treatments in the experiment described by Lin and coworkers (26), we used their RNA-seq data (PRJNA997595) on a subset of 190 plants that supposedly included 4 accessions. To obtain DNA sequence variant data from RNA-seq data, we followed, as far as possible (i.e., base recalibration required known sites that were not available), the GATK-guidelines (https://gatk.broadinstitute.org/hc/en-us/articles/360035531192-RNAseq-short-variant-discovery-SNPs-Indels?id=3891). In brief, short reads were (quality) trimmed with fastp (version 0.23.4, 51) and aligned to the reference genome with STAR (version 2.7.11a with the options --alignIntronMax 10000 --alignMatesGapMax 10000 --outSAMstrandField intronMotif and supplying the sample ID as read-group information; 61). Only unique alignments with a quality of at least 250 were kept. Read duplicates were marked with Picard tools (version 2.18.25, broadinstitute.github.io/picard) and spliced reads were split with GATK SplitNCigarReads. Bam files were merged into a single file to reduce running time while calling variants. Data were subset to 10,000 randomly chosen genes. Variants were finally called with GATK HaplotypeCaller (with the options --dont-use-soft-clipped-bases --standard-min-confidence-threshold-for-calling 20.0) and filtered with GATK VariantFiltration (with the options --window 35 --cluster 3 --filter-name “FS” --filter “FS > 30.0” --filter-name “QD” --filter “QD < 2.0”). The resulting VCF file was then filtered for variants with a coverage between 10 and 1,000 within at least four plants per accession and treatment. Variants with a minor allele frequency below 5% were removed. 13,418 variants passed this filter. We did not filter accessions separately when plotting them individually to maintain the distance values. Genetic pairwise distances were calculated as the fraction of alleles that differed between two individuals (62). We visualized the distances as described previously (62). When we observed that some plants within two of the accessions were clearly distinct from each other, we used the data on more than 1000 Arabidopsis accessions available from 1001genomes.org (https://1001genomes.org/data/GMI-MPI/releases/v3.1/1001genomes_snp-short-indel_only_ACGTN.vcf.gz). We intersected the two data sets (12,686 variants remained) and calculated all pairwise distances between the experimental plants and the publicly available accessions. Experimental plants were then assigned to the accession that was closest to it (Supplementary data S3).

Acknowledgements

We thank Tobias Züst (University of Zurich) for providing the seeds which served as starting material for this study, Christian Heichinger (Roche) for advice on genotyping and experimental setup, and Christina Westermann (University of Zurich) for comments on the manuscript; Vimal Rawat, Gregor Rot, and Deepak Kumar Tanwar (University of Zurich) for discussions; Valeria Gagliardini, Christof Eichenberger, Daniela Guthörl, Arturo Bolaños, and Peter Kopf (University of Zurich) for general lab support; and Karl Huwiler (University of Zurich) for plant care. This work was supported through core funding of the University of Zurich and a project of the University Research Priority Program “Evolution in Action”.

Additional information

Author contributions

UG and BS conceived and supervised the project. KK performed the experiments. MWS, KK and BS analyzed the data. SW contributed to data analysis. MWS, KK, and UG wrote the manuscript with input from the other authors. All authors read and approved the final manuscript.

Accession codes

Short reads were deposited at the NCBI Sequence Read Archive (SRA, www.ncbi.nlm.nih.gov/sra) and are available through accession number PRJNA1103971.

Additional files

Supplementary data 1. Collection of workbooks with results from the different analyses. The files are named as <PHENOTYPE>_<DATA set>.xlsx. For each phenotype, there are analyses with all three accessions (“all”), only the two accessions with trichomes (“NoWil3”), or with only one accession (“OnlyWei0”, “OnlyMa0”, and “OnlyWil3”). For each of them, there is one version with all data included and one in which the extreme values were removed (”.outliersRemoved”). For example: bladeArea_all.outliersRemoved.xlsx refers to the analysis of the blade area using all three accessions and removing extreme values. Within each workbook, there are three sheets for untransformed data (“y”), log-transformed data (“logy”), and square-root-transformed data (“sqrty”). Within each sheet, the output of the two models tested either with the regular linear model (lm) or the random effects model (lmer) are given. In case of the regular lm output, alternative F- and P-values (F_altern and P_altern) refer to the tests against the correct error stratum (i.e., nested model). The columns F and P contain the F- and P-values from tests against the residuals. Regarding the model terms: “ Blocks” is the five level factor of experimental blocks that was split into two contrasts (“EX” and “BL”); “ Selection” is the four-level factor with the selection treatments that was split into three contrasts (“AN”, “CO”, and “SE”). “ Selection_X_accession/experiment” is accordingly the interaction of the four-level factor with the selection treatments and the accession (“ET”) or the “EX” contrast. Abbreviations: Df: Degrees-of-freedom, SS: Sum of squares, MS: Mean squares, F: F-value, P: P-value, F_altern: F-value of nested model, P_altern: P-value of nested model, perc_SS: %-Sum of squares explained by the term.

Supplementary data 2. A workbook containing tables from the power estimations. For each phenotype, power was estimated using all three accessions or only the two accessions with trichomes. In case of trichome density, only the latter is available. Each table contains the percentage of significant outcomes for a given simulation and factor/contrast of interest (based on 1,000 simulations). The simulations are named as <TYPE>_<EFFECT size>. There were five different types of simulations with effect sizes ranging from 1 to 20%: (“step”) differences between all four groups with step sizes of 1, 5, 10, 15, and 20% between each group (with the level following AN<CO<BB<LE, note that the largest difference here is 60%); (“onlyAn”) only different in ancestors; (“onlyCo”) only different in controls; (“onlySe”) different in both selection treatments; (“onlyLe”) different in only one selection treatment.

Supplementary data 3. Table with the samples from Lin and colleagues (26). Given are the sample IDs, line ID, treatment, generation, accession label (used by Lin an colleagues), and the accession assignments based on the genotypes inferred from the RNA-seq data and the publicly available data from 1001 Arabidopsis accessions (this study).

Supplementary data 4. Table with the phenotype data. Rosette diameter in mm, blade area in mm2. Factors are labelled as described above in Materials and Methods.