
Natural genetic variation in Arabidopsis thaliana defense metabolism genes modulates field fitness

  1. Rachel Kerwin
  2. Julie Feusier
  3. Jason Corwin
  4. Matthew Rubin
  5. Catherine Lin
  6. Alise Muok
  7. Brandon Larson
  8. Baohua Li
  9. Bindu Joseph
  10. Marta Francisco
  11. Daniel Copeland
  12. Cynthia Weinig
  13. Daniel J Kliebenstein (corresponding author)
  1. University of California, Davis, United States
  2. University of Utah, United States
  3. University of Wyoming, United States
  4. Cornell University, United States
  5. Misión Biológica de Galicia, Spain
  6. University of Copenhagen, Denmark
Research Article
Cite as: eLife 2015;4:e05604 doi: 10.7554/eLife.05604

Abstract

Natural populations persist in complex environments, where biotic stressors, such as pathogen and insect communities, fluctuate temporally and spatially. These shifting biotic pressures generate heterogeneous selective forces that can maintain standing natural variation within a species. To directly test if genes containing causal variation for the Arabidopsis thaliana defensive compounds, glucosinolates (GSL), control field fitness and are therefore subject to natural selection, we conducted a multi-year field trial using lines that vary in only specific causal genes. Interestingly, we found that variation in these naturally polymorphic GSL genes affected fitness in each of our environments, but the pattern fluctuated such that highly fit genotypes in one trial displayed lower fitness in another and no GSL genotype or genotypes consistently out-performed the others. This was true both across locations and within the same location across years. These results indicate that environmental heterogeneity may contribute to the maintenance of GSL variation observed within Arabidopsis thaliana.

https://doi.org/10.7554/eLife.05604.001

eLife digest

‘Genetic variation’ describes the naturally occurring differences in DNA sequences that are found among individuals of the same species. These genetic differences arise from random mutations and may be passed on to their offspring. Some of these mutations may improve the ability of an individual to survive and reproduce—known as fitness—and are likely to become more common in the population. Other mutations may reduce an individual's fitness and are likely to be lost. However, it is believed that most of the mutations will have no effect on the fitness of individuals.

It is not known why many of these ‘neutral’ genetic differences are maintained in populations. Some researchers have proposed that they are kept by chance and that there is no direct advantage to the population of keeping them unless these neutral mutations later become beneficial. However, other researchers think that the genetic variation itself may improve the fitness of the population by allowing it to quickly adapt to changes in the environment.

Arabidopsis thaliana is a small plant that lives in many different environments and has high levels of genetic variation in many of its physical traits. One of these traits is the production of molecules called glucosinolates, which help the plants to defend against herbivores and infection by microbes. Previous studies have suggested that variation in the genes that make glucosinolates may improve the fitness of A. thaliana populations.

To test this idea, Kerwin et al. carried out a field trial using A. thaliana plants that were genetically identical except for some of the genes involved in the production of glucosinolates. Kerwin et al. grew the plants in several different environments over several years. The field trial shows that variation in these genes affected the fitness of the plants in each of the different environments. However, the fitness benefit depended on the environment, and no single gene variant provided the best fitness across all environments, or over all the years of the trial.

Kerwin et al.'s findings suggest that changes in the environment may contribute to the maintenance of genetic variation in the genes that make glucosinolates. This raises the questions of how many other genes in plants (or other species such as humans) have genetic variation that contributes to fitness across varied environments; and how can this link be tested in natural settings.

https://doi.org/10.7554/eLife.05604.002

Introduction

High levels of standing variation have often been observed among many natural plant and animal populations. This is particularly true for the model species Arabidopsis thaliana, which exhibits variation both within and among natural populations and/or accessions (Pigliucci and Marlow, 2001; Atwell et al., 2010; Bomblies et al., 2010; Chan et al., 2010; Platt et al., 2010; Cao et al., 2011; Debieu et al., 2013; Joseph et al., 2013; Long et al., 2013; Anwer et al., 2014; Li et al., 2014). Models based on mutation-selection balance theory predict that this observed variation will be due to rare alleles at many loci introduced through random mutations that evolution acts on to eliminate through persistent purifying natural selection (Kimura, 1968; Turelli, 1984). In agreement, studies of nucleotide variation in Arabidopsis have found an excess of low frequency polymorphisms relative to expectation (Purugganan & Suddith, 1998, 1999). However, other studies cloning causal genetic variants from natural Arabidopsis accessions have found several intriguing examples of intermediate frequency alleles maintained at polymorphic loci (Johanson et al., 2000; Long et al., 2000; Li et al., 2014). This variation among loci has led to a long-standing interest in elucidating to what extent this genetic variation is neutral in origin or, alternatively, maintained through selective forces (Levene, 1953; Hedrick et al., 1976; Bull, 1987; Stahl et al., 1999; Prasad et al., 2012).

The neutral theory posits that the majority of genetic polymorphisms have no effect on fitness and that stochastic evolutionary processes, such as genetic drift and migration, are sufficient to explain the genetic and phenotypic variation observed within and among populations (Darwin, 1859; Kimura, 1968; Duret, 2008). This hypothesis has generated numerous modeling studies demonstrating that the standing level of genetic variation in traits can be explained by the demographic history of a species not linked to fitness of an individual (Wolf et al., 2000; Barton and Turelli, 2004; Hufford et al., 2012; Pyhajarvi et al., 2013). However, for many ecologically important traits, phenotypic variation has been shown to empirically impact fitness in natural populations, suggesting that natural selection also plays an important role in the evolution of such traits (Mothershead and Marquis, 2000; Adler et al., 2001; Tian et al., 2003; Korves et al., 2007; Milla et al., 2009). A key step necessary to begin to resolve these discrepancies between theory and empirical observations requires the validation of fitness consequences of variation at specific loci or pathways in the field (Turelli and Barton, 2004; Fournier-Level et al., 2011; Hancock et al., 2011).

Determining the impact of polygenic variation upon fitness in the field informs our understanding of the potential selective and non-selective evolutionary processes that protect or maintain phenotypic variation within a species, such as genetic drift and balancing selection (Kimura, 1968; Hedrick et al., 1976; Mitchell-Olds et al., 2007; Mojica et al., 2012). However, most population level studies of evolution and selection in the field have focused on polygenic populations and have been unable to validate the link between variation at specific underlying genes and the resulting fitness consequences of this variation (Lande and Arnold, 1983; Mitchell-Olds and Rutledge, 1986; Gillespie and Turelli, 1989; Orr, 1998). Studies using structured mapping populations, such as Arabidopsis RILs, can only associate large genomic regions, rather than individual genes, with quantitative variation in fitness (Weinig et al., 2003; Stinchcombe et al., 2004; Juenger et al., 2005; Malmberg et al., 2005). More recently, genome wide studies using A. thaliana accessions have been able to associate SNPs to fitness in the field and even predict relative fitness of accessions grown in a common garden (Fournier-Level et al., 2011; Hancock et al., 2011). However, these associations between loci and fitness need more refining to validate the effect of individual genes. Testing if individual genes impact fitness in the field first requires identifying and cloning the causal genes underlying the phenotypic variation of interest (Mitchell-Olds, 1995; Tian et al., 2003; Mitchell-Olds et al., 2007). Then, these natural alleles need to be recreated as single gene lines, which can require approaches such as chemical mutation (e.g., EMS), generation of transgenic individuals via Agrobacterium-mediated transformation, and/or generation of isogenic lines through successive rounds of backcrossing. 
Therefore, empirical field testing of individual causative polymorphic genes has only been done rarely, and we do not yet have a good understanding of the extent to which individual genes impact fitness in the field (Tian et al., 2003; Schuman et al., 2012).

A. thaliana has become a key model system and is extremely suitable for characterizing, cloning and validating genes influencing the fitness consequences of underlying natural variation. This is due, in part, to the ease of transformation as well as the abundance of genomic resources available for this organism, including an extensive library of T-DNA insertion lines and natural accessions (Alonso et al., 2003). Arabidopsis persists in many different environments and experiences selection from both abiotic pressures, such as temperature and precipitation, and biotic pressures, such as insect and pathogen populations that vary temporally and spatially (Meyerowitz, 1987; Richards et al., 2009). Potentially to maximize fitness across a broad range of biota, Arabidopsis has evolved high levels of natural variation among accessions for many important phenotypic traits, including the defense compounds, glucosinolates (GSLs) (Stahl et al., 1999; Atwell et al., 2010; Chan et al., 2010). GSLs constitute a diverse set of plant-made defensive metabolites restricted primarily to the Brassicales that are partitioned into three classes, indolic, aliphatic and aromatic, depending on their amino acid precursor. These N- and S-containing compounds are stored in the vacuoles of plant cells until they are activated through tissue damage, which can occur through insect feeding and pathogen attack. Natural genetic polymorphisms found among a suite of aliphatic GSL genes in Arabidopsis are responsible for the majority of GSL diversity observed in the leaf tissue (Figure 1). These aliphatic GSL genes encode enzymes, transcription factors and activation co-factors that have been identified, cloned and validated in a laboratory setting (Table 1) (Haughn et al., 1991; Li and Quiros, 2003; Hansen et al., 2007, 2008; Hirai et al., 2007; Li et al., 2008; Neal et al., 2010).
Previous studies have uncovered links between GSL variation and ecologically important traits in Arabidopsis, such as resistance to insect/pathogen damage, flowering time, and growth, suggesting that GSLs play an important role in determining plant fitness (Mauricio, 1998; Kliebenstein et al., 2002; Bidart-Bouzat and Kliebenstein, 2008; Hansen et al., 2008; Burow et al., 2010; Kerwin et al., 2011; Züst et al., 2011). Since the genes responsible for the majority of natural polymorphism in aliphatic GSL have been well characterized in a laboratory setting, the GSL pathway in Arabidopsis provides a good system for understanding the impact that individual genes might have on fitness in the field (Kliebenstein et al., 2001b; Halkier and Gershenzon, 2006; Hansen et al., 2008). In this study, we tested the fitness consequences of aliphatic GSL variation in the field by utilizing a collection of lines that vary at specific GSL genes in Arabidopsis (Col-0), which recreated observed natural variation in the aliphatic GSL pathway found among accessions (Table 2) (Mauricio, 1998; Kliebenstein et al., 2002; Bidart-Bouzat and Kliebenstein, 2008; Hansen et al., 2008; Burow et al., 2010; Kerwin et al., 2011; Züst et al., 2011).

Figure 1 with 17 supplements
Overview of aliphatic GSL biosynthesis and activation in Arabidopsis thaliana.

Arrows represent the steps involved in aliphatic glucosinolate (GSL) biosynthesis that have been validated through laboratory experiments and are naturally variable within A. thaliana. Gene names are listed next to or above the arrows. (A) Regulation of aliphatic GSL biosynthesis. The transcription factors MYB28 and MYB29 control accumulation of aliphatic GSL. A double knockout in these genes results in no aliphatic GSL accumulation, while a single knockout in these genes leads to a 50% reduction in aliphatic GSL (myb28) or a 25% reduction in aliphatic GSL (myb29), compared to WT Col-0. The biosynthetic enzymes MAM1 and AOP2 also influence aliphatic GSL accumulation and a non-functional allele at either locus leads to decreased GSL accumulation. (B) Amino acid chain elongation. During chain elongation, carbons are added to a methionine precursor through a series of reactions producing an elongated amino acid. Variation at the Elong locus controls the number of carbons added to the amino acid precursor and therefore, the length of the GSL side chain, R. A functional allele at this locus, MAM1, leads primarily to accumulation of GSL with four carbon (4C) length side chains, whereas a non-functional allele, gsm1, leads to accumulation of GSL with three carbon (3C) length side chains. The elongated amino acid is subsequently converted into a GSL via the core pathway (not shown). (C) Side chain modification. The GSL compounds produced can then undergo a series of side chain modifications that lead to the suite of diverse GSL compounds found in Arabidopsis. Side chain modification is controlled by variation of GSOX1, GSOX3, AOP2, AOP3 and GSOH. GSOX1 & GSOX3 oxygenate a methylthio (MT) to methylsulfinyl (MSO) GSL. AOP2 converts MSO to alkenyl, such as allyl and but-3-enyl. AOP3, on the other hand, converts only 3C length MSO to OH-propyl GSL and cannot act on the 4MSO GSL. GSOH oxygenates the 4C but-3-enyl to the OH-alkenyl GSL, OH-but-3-enyl. 
Since GSOH acts on but-3-enyl GSL, which is a product of AOP2, a functional AOP2 is necessary for GSOH to function and AOP2 is said to be epistatic to GSOH. Col-0 is functional for MAM1 and the GSOX's, null for both AOP2 and AOP3, and functional for GSOH, resulting in accumulation of primarily 4MSO GSL. See Figure 1—figure supplements 1–17 for images of GSL traces for each GSL genotype in our mutant laboratory population. (D) GSL Activation. Once produced, GSLs are stored in the vacuole in their stable, unreactive form until activation occurs. Upon cellular disruption, such as occurs during pathogen attack, insect herbivory or even wind damage, GSLs come into contact with their own plant-made activating enzyme, myrosinase. After production, myrosinase is stored in vacuoles of idioblastic cells called myrosin bodies. Myrosinase activates the GSL compound by cleaving the glucose moiety, yielding an unstable aglycone structure that non-enzymatically rearranges to either nitriles or isothiocyanates, depending on the presence of the co-activators ESM1 and ESP.
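
The pathway logic in this legend can be read as a small decision rule. The following Python sketch is illustrative only and is not part of the original study: the `Genotype` class and `predict_chemotype` function are hypothetical names, the rule ignores the effects of MAM1 and AOP2 on total accumulation, and giving AOP2 priority when both AOP2 and AOP3 are functional is an assumption the legend does not address.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Genotype:
    """Allelic state (True = functional); defaults follow Col-0 as described in the legend."""
    myb28: bool = True
    myb29: bool = True
    mam1: bool = True
    aop2: bool = False  # Col-0 is null for AOP2
    aop3: bool = False  # Col-0 is null for AOP3
    gsoh: bool = True


def predict_chemotype(g: Genotype) -> Tuple[str, float]:
    """Return (dominant aliphatic GSL class, accumulation relative to Col-0)."""
    # Panel B: functional MAM1 gives mainly 4C side chains, gsm1 gives 3C.
    chain = "4C" if g.mam1 else "3C"

    # Panel C: side chain modification.
    if g.aop2:  # assumption: AOP2 takes priority if AOP3 is also functional
        # GSOH acts only on 4C but-3-enyl, which requires a functional AOP2.
        side_chain = "OH-alkenyl" if (g.gsoh and chain == "4C") else "alkenyl"
    elif g.aop3 and chain == "3C":
        side_chain = "OH-propyl"
    else:
        side_chain = "MSO"

    # Panel A: approximate reductions quoted in the legend.
    if not g.myb28 and not g.myb29:
        level = 0.0
    elif not g.myb28:
        level = 0.5
    elif not g.myb29:
        level = 0.75
    else:
        level = 1.0
    return f"{chain} {side_chain}", level


print(predict_chemotype(Genotype()))            # Col-0
print(predict_chemotype(Genotype(aop2=True)))   # functional AOP2 in Col-0
print(predict_chemotype(Genotype(mam1=False)))  # gsm1
```

Under these assumptions Col-0 maps to a 4C MSO profile, which agrees with the statement that Col-0 accumulates primarily 4MSO GSL.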

https://doi.org/10.7554/eLife.05604.003
Table 1

Polymorphic genes involved in aliphatic GSL synthesis and activation

https://doi.org/10.7554/eLife.05604.021
| Gene name | Locus | ATG # | Gene type | Gene function | Mutation type in Col-0 |
| --- | --- | --- | --- | --- | --- |
| MYB28 | MYB28 | At5g61420 | TF | Positive regulator of aliphatic GSL (Sønderby et al., 2007, 2010) | T-DNA |
| MYB29 | MYB29 | At5g07690 | TF | Positive regulator of aliphatic GSL (Sønderby et al., 2007, 2010) | T-DNA |
| MAM1 | MAM1 | At5g23010 | Enzyme | Controls 3C–4C chain elongation (Haughn et al., 1991; de Quiros et al., 2000; Kroymann et al., 2003) | EMS |
| GSOX1 | GSOX | At1g65860 | Enzyme | Converts MT to MSO GSL (Hansen et al., 2007; Li et al., 2008) | T-DNA |
| GSOX3 | GSOX | At1g62560 | Enzyme | Converts MT to MSO GSL (Hansen et al., 2007; Li et al., 2008) | T-DNA |
| AOP2 | AOP | At4g03060 | Enzyme | Converts MSO to alkenyl GSL (Kliebenstein et al., 2001c) | 35S OX |
| AOP3 | AOP | At4g03050 | Enzyme | Converts 3MSO to hydroxy-propyl GSL (Kliebenstein et al., 2001c) | n/a |
| GSOH | GSOH | At2g25450 | Enzyme | Converts butenyl to OHB (Hansen et al., 2008) | T-DNA |
| ESP | ESP | At1g54040 | Co-factor | Guides formation of activated GSL to nitriles (Lambrix et al., 2001) | 35S OX |
  1. Shown are the identities, functions and mutation types of nine genes representing seven loci important for aliphatic GSL synthesis and activation. These genes were chosen for the mutant laboratory population of Arabidopsis thaliana accession Col-0 because they represent the majority of aliphatic GSL variation observed in Arabidopsis. Each of these genes is naturally polymorphic among Arabidopsis accessions.

Table 2

Allelic variation of polymorphic aliphatic GSL loci in structured population

https://doi.org/10.7554/eLife.05604.022
Genotype  MYB28  MYB29  MAM1  GSOX1  GSOX3  AOP2  GSOH  ESP
Col-0  ++++++
myb28  +++++
myb29  +++++
gsm1  +++++
gsox1  +++++
gsox3  +++++
AOP2  ++++++
AOP2/gsoh  ++++++
Gsoh  +++++
Myb28/gsoh  +++++
Myb28/gsm1  ++++
Myb28/AOP2  ++++++
Myb29/gsm1  ++++
Myb29/AOP2/gsoh  ++++++
Myb28/myb29  ++++
Myb28/myb29/gsoh  +++
ESP  +++++++
  1. Shown are the genotypes in the mutant laboratory GSL population used in this study and the allelic state of each gene within each of them. Each gene in our population is naturally polymorphic among Arabidopsis accessions. See Table 1 for gene functions. For each gene listed, a ‘+’ indicates a functional allele and a ‘−’ indicates a non-functional allele. The loss-of-function and gain-of-function mutant lines shown in Table 1 were manually crossed to generate this population of genotypes, each of which varies from Col-0 at only these eight genes, including single, double, and triple mutants.

Results

Synthetic laboratory population mimics natural GSL variation in Arabidopsis

The GSL profile of a plant is characterized by the presence and relative abundance of the various GSL structures it produces. Among Arabidopsis accessions, GSL profiles show extensive phenotypic variation across the species' geographic distribution (Figure 2) (Chan et al., 2010). While previous studies have linked GSL profile variation to insect resistance, as well as correlated the geographic distribution of insect populations with GSL profile-type across Europe, it is still not known to what extent, if at all, individual GSL genes affect fitness in the field (Mauricio, 1998; Bidart-Bouzat and Kliebenstein, 2008; Züst et al., 2012). To test if standing genetic variation within the aliphatic GSL defensive pathway of A. thaliana impacts fitness in the field, we utilized an existing set of genotypes that recreate natural variation at eight specific GSL loci, with the reference accession, Col-0, as the genetic background. These transgenic lines consisted of loss-of-function T-DNA insertion lines, an EMS mutant and gain-of-function overexpression lines that were originally created to validate individual genes as causal for GSL natural variation (Table 1). For example, the AOP2 gene was found to encode an enzyme that converts methylsulfinyl (MSO) GSL into alkenyl GSL (Figure 1 and Figure 1—figure supplement 7) (Kliebenstein et al., 2001c). Importantly, the AOP2 gene is polymorphic among Arabidopsis accessions, with the Col-0 accession containing a natural knockout that abolishes its function. Therefore, introducing the functional allele back into Col-0 created a single gene mimic of the natural variation found in Arabidopsis (Figure 1 and Table 1) (Kliebenstein et al., 2001c). The natural variation at the other causal genes has been similarly mimicked as described in the listed citations (Table 1). This was facilitated by the fact that all of these genes contain natural presence/absence polymorphisms (citations in Table 1).

Figure 2
Globally distributed collection of Arabidopsis thaliana accessions that vary with respect to GSL haplotype.

Shown are the geographic origins of 144 Arabidopsis accessions across (A) Europe and Northern Africa, (B) North America and (C) Japan, as well as their corresponding GSL haplotypes and chemotypes. GSL haplotype names correspond to allelic identity at six polymorphic loci involved in aliphatic GSL production, based on GSL profile data collected from each accession. Haplotype names use Col-0 as a reference, which is functional at four of the six loci. Symbol shape, color and size indicate GSL chemotype (i.e., phenotype based on GSL profile). Red = 3C (non-functional MAM1), green = 4C GSL (functional MAM1), triangle = MSO (non-functional AOP), square = alkenyl (functional AOP2), circle = OH-alkenyl (functional GSOH), star = OH-Propyl (functional AOP3), point size 1 = 100% accumulation of aliphatic GSL (compared to Col-0), point size 0.5 = 50% accumulation of aliphatic GSL (non-functional MYB28) and point size 1.5 = 75% accumulation of aliphatic GSL (non-functional MYB29). See Figure 2—source data 1 for table of accession geographic information, Figure 1 for schematic of biosynthetic pathway and Figure 9 for more details on the allelic state at each locus for all 18 GSL haplotypes.

https://doi.org/10.7554/eLife.05604.023

Each of these transgenic lines had been backcrossed to Col-0 several times to remove unlinked polymorphisms in the original studies (Table 1). For this study, the transgenic lines were manually crossed to each other to represent the phenotypic variation in GSL profiles found among Arabidopsis accessions (Table 2, Figures 1, 2). This synthetic laboratory population varies at specific genes controlling aliphatic GSL variation within a single common genetic background. Utilizing this synthetic laboratory population, we can explicitly measure the impact of variation in a suite of aliphatic GSL genes on fitness components in the field without confounding variation in other regions of the genome.

We tested our population in multiple environments, which allowed us to separate the effects of genotype from environment, to determine if traits measured in the field are environmentally controlled. This could be particularly important if selection pressures fluctuate across environments. We transplanted 2 week old, greenhouse-germinated replicates of the synthetic laboratory population into the field at the University of California, Davis in Davis, CA in Spring 2012 and the University of Wyoming in Laramie, WY in Summer 2011 and Summer 2012. In each of our three field trials, which represent three environments, genotypes were replicated in 40 randomized blocks in the field, for a total of 120 blocks/replicates. To distinguish the effects of GSL variation alone from the interaction of GSL variation with field herbivory as well as assess the effects of leaf damage in the field, half of the blocks in each field trial were treated with pesticides and the other half were not (Figure 3) (Mauricio, 1998).

Figure 3
Split-plot field trial setup.

Shown is the field trial setup used in all three environments. In each environment, 40 blocks were arranged into rows of 10 blocks and each row was called a plot. Within each block, the complete set of 17 genotypes was randomly organized, for a total of 40 genotype replicates per environment. Each plot (four per environment) was placed into one of two treatment groups. The ‘− Herbivory’ treatment group received pesticide application to prevent leaf damage (shown in blue). The ‘+ Herbivory’ or control treatment group did not receive pesticide application (shown in red). This setup was repeated in each of the three environments, for a total of 120 blocks/genotype replicates and 12 plots, split between the two treatment groups. Environment and treatment were nested within plot, making this field trial setup a split-plot design. Seedlings were transplanted from the greenhouse into the field at 2 weeks of age where they were allowed to flower and then subsequently harvested for further analysis in the laboratory.
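
As a rough illustration of the layout described in this legend, the sketch below generates one possible randomization in Python. It is not the authors' procedure: the function name `build_layout`, the fixed seed, and the random assignment of the two treatments to plots are assumptions. Genotype names follow Table 2 and environment labels follow Table 4.

```python
import random

# Genotype names follow Table 2; environment labels follow Table 4.
GENOTYPES = [
    "Col-0", "myb28", "myb29", "gsm1", "gsox1", "gsox3", "AOP2",
    "AOP2/gsoh", "gsoh", "myb28/gsoh", "myb28/gsm1", "myb28/AOP2",
    "myb29/gsm1", "myb29/AOP2/gsoh", "myb28/myb29", "myb28/myb29/gsoh", "ESP",
]
ENVIRONMENTS = ["UCD2012", "UWY2011", "UWY2012"]
PLOTS_PER_ENVIRONMENT = 4
BLOCKS_PER_PLOT = 10


def build_layout(seed=0):
    """Return one record per plant: (env, plot, treatment, block, position, genotype)."""
    rng = random.Random(seed)
    layout = []
    for env in ENVIRONMENTS:
        # Two plots per treatment in each environment; which plots are
        # sprayed is assigned at random here, which is an assumption.
        treatments = ["- Herbivory", "+ Herbivory"] * (PLOTS_PER_ENVIRONMENT // 2)
        rng.shuffle(treatments)
        for plot, treatment in enumerate(treatments, start=1):
            for block in range(1, BLOCKS_PER_PLOT + 1):
                order = GENOTYPES[:]
                rng.shuffle(order)  # randomize genotype positions within the block
                for position, genotype in enumerate(order, start=1):
                    layout.append((env, plot, treatment, block, position, genotype))
    return layout


layout = build_layout()
blocks = {(env, plot, block) for env, plot, _, block, _, _ in layout}
print(len(layout), "plants in", len(blocks), "blocks")  # 2040 plants in 120 blocks
```

Each block holds the complete set of 17 genotypes once, so the 120 blocks give 40 replicates of every genotype per environment, matching the counts in the legend.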

https://doi.org/10.7554/eLife.05604.025

GSL genetic variation controls GSL profile in the field

Since the genes underlying variation in the aliphatic GSL pathway investigated in this study have been previously validated using lab techniques, we have a solid working knowledge of the resulting laboratory GSL profiles (Beekwilder et al., 2008; Hansen et al., 2008) (Figure 1—figure supplements 1–17). However, these GSL genotypes have not previously been tested in the field to determine if they produce the same GSL profiles as when grown in the laboratory. We particularly wanted to assess if variation at individual aliphatic GSL genes has the same impact on GSL profile in the field as predicted from published lab experiments when the plants are grown in different complex environments, and therefore measured GSL on all the plants grown in each of our three field trials. A mixed model analysis of field GSL revealed that the majority of variation in GSL profiles in the field was controlled by the GSL genotypes that we generated (Table 3). Importantly, the majority of the GSL genotypes produced the expected GSL profiles in the field, consistent with the lab studies (Figure 4 and Figure 1—figure supplements 1–17). To quantify the similarity in profiles between field and lab grown samples, we conducted a principal component analysis (PCA) using the GSL profiles of these genotypes grown in a growth chamber. The first four vectors from our PCA were able to explain >99% of the variation in GSL profile. We utilized the loadings from the chamber PCA to estimate PCA scores of the first four vectors using the chamber GSL and field GSL. The scores for the field grown genotypes were highly correlated with those for the lab grown genotypes, showing that the GSL genetic variation leads to highly similar field and lab profiles (Table 4).

Table 3

Mixed model table of leaf damage, flowering time and GSL in the field

https://doi.org/10.7554/eLife.05604.026
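| Fixed effects: source of variation | Leaf damage df | Leaf damage SS | Leaf damage MS | Leaf damage F | Leaf damage p value | Flowering time df | Flowering time SS | Flowering time MS | Flowering time F | Flowering time p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Genotype | 16 | 280 | 18 | 3.7 | 7.6E-07 | 16 | 7471 | 467 | 9.4 | 5.8E-23 |
| Environment | 2 | 207 | 104 | 7.5 | 0.02 | 2 | 36603 | 18302 | 464.6 | 8.6E-08 |
| Treatment | 1 | 17 | 17 | 0.3 | - | 1 | 107 | 107 | 7.3 | 0.04 |
| Geno:Env | 32 | 617 | 19 | 4.0 | 7.2E-13 | 32 | 2678 | 84 | 3.0 | 5.3E-08 |
| Geno:Trt | 16 | 75 | 5 | 1.0 | - | 16 | 655 | 41 | 1.3 | - |
| Env:Trt | 2 | 32 | 16 | 1.9 | - | 2 | 526 | 263 | 6.5 | 0.03 |
| Geno:Env:Trt | 32 | 201 | 6 | 1.3 | - | 32 | 1115 | 35 | 1.3 | - |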
Random effectsLeaf damageFlowering time
Source of variationdfSSMSχ2p valuedfSSMSχ2p value
Plot(Trt:Env)1008.70.0031000.0
Residual190450NA1750260NA
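| Fixed effects: source of variation | Total aliphatic GSL df | Total aliphatic GSL SS | Total aliphatic GSL MS | Total aliphatic GSL F | Total aliphatic GSL p value | Total indole GSL df | Total indole GSL SS | Total indole GSL MS | Total indole GSL F | Total indole GSL p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Genotype | 16 | 2509273 | 156830 | 34.6 | 3.0E-95 | 16 | 33432 | 2090 | 7.3 | 1.1E-16 |
| Environment | 2 | 14588 | 7294 | 1.6 | - | 2 | 1520 | 760 | 0.6 | - |
| Treatment | 1 | 1430 | 1430 | 0.3 | - | 1 | 2208 | 2208 | 4.3 | - |
| Geno:Env | 32 | 305993 | 9562 | 2.1 | 3.3E-04 | 32 | 14139 | 442 | 1.5 | 0.03 |
| Geno:Trt | 16 | 100139 | 6259 | 1.4 | - | 16 | 7488 | 468 | 1.6 | - |
| Env:Trt | 2 | 3938 | 1969 | 0.4 | - | 2 | 610 | 305 | 1.0 | - |
| Geno:Env:Trt | 32 | 116269 | 3633 | 0.8 | - | 32 | 9531 | 298 | 1.0 | - |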
Random effectsTotal aliphatic GSLTotal indole GSL
Source of variationdfSSMSχ2p valuedfSSMSχ2p value
Plot(Trt:Env)119319372.65.6E-161151567.52.2E-16
Residual149043422NA14912690NA
  1. A linear mixed model was fitted to phenotypes measured on plants grown in the field. Variation was partitioned among the fixed effects, Genotype, Environment, and Treatment as well as a random factor, Plot, inside which Treatment and Environment were nested. Phenotypic data used in the model were collected on 17 genotypes from three environments and two treatment groups. df = degrees of freedom, SS = Type II Sums of Squares variation, MS = Mean Squared variation, F = F statistic (for fixed factors), χ2 = chi squared statistic (for random factors). p value = probability value from either an F test or a chi squared test, depending on the source of variation. Non-significant p values (>0.05) are represented by a dash.
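
As a small arithmetic check on the quantities defined in this footnote, the mean square of a fixed effect is its sum of squares divided by its degrees of freedom. The Python sketch below applies that to three flowering-time rows as read from Table 3; the helper name `mean_square` is illustrative, and the F statistics are not recomputed because the error terms of the mixed model are not reproduced here.

```python
# (df, SS) pairs as read from the flowering-time columns of Table 3.
flowering_time = {
    "Genotype": (16, 7471),
    "Geno:Env": (32, 2678),
    "Geno:Trt": (16, 655),
}


def mean_square(df, ss):
    """MS = SS / df for one source of variation."""
    return ss / df


mean_squares = {
    source: round(mean_square(df, ss)) for source, (df, ss) in flowering_time.items()
}
print(mean_squares)  # {'Genotype': 467, 'Geno:Env': 84, 'Geno:Trt': 41}
```

The rounded values agree with the MS entries reported for those rows.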

Figure 4
Average GSL profiles from select laboratory population genotypes grown in the field.

Shown are the genotype averages of various aliphatic GSL chemical structures from GSL genotypes grown in all three environments in the field. The GSL structures present and the corresponding abundances make up the GSL profile of an individual. Results are based on single leaf analysis of 4 week old plants (see Table 2 for full list of GSL genotypes used in this study). Each color represents a different aliphatic GSL genotype. Error bars represent standard error of the mean. Letters represent significantly different genotypes based on Tukey's HSD. See Figure 4—source data 1 for full list of GSL genotypes used in this study and the corresponding LSMeans and SE of GSL structures produced by all GSL genotypes used in the study, averaged across field trials.

https://doi.org/10.7554/eLife.05604.029
Table 4

PCA comparison of GSL profiles produced by GSL genotypes from synthetic laboratory population grown in the chamber and all three environments

https://doi.org/10.7554/eLife.05604.031
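| Environment | PCA1 (48.5%) R | PCA1 p value | PCA2 (29%) R | PCA2 p value | PCA3 (16%) R | PCA3 p value | PCA4 (6%) R | PCA4 p value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chamber | 1.00 | | 1.00 | | 1.00 | | 1.00 | |
| UCD2012 | 0.97 | <0.001 | 0.97 | <0.001 | 0.82 | <0.001 | 0.96 | <0.001 |
| UWY2011 | 0.91 | <0.001 | 0.95 | <0.001 | 0.74 | <0.001 | 0.97 | <0.001 |
| UWY2012 | 0.90 | <0.001 | 0.91 | <0.001 | 0.85 | <0.001 | 0.86 | <0.001 |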
  1. Glucosinolate analysis was conducted on the 17 genotypes within a long-day growth chamber (16 hr light) set to match the median light regime for the three environments. Principal component analysis was conducted on the mean glucosinolate accumulation for the aliphatic glucosinolates within the chamber environment. This creates a set of mathematical descriptors of the chemotype variation across the 17 genotypes. The first four eigenvectors were used to generate scores from the lsmeans of the glucosinolates across the 17 glucosinolate genotypes independently for the chamber and the three field environments. These scores were then correlated to test if the GSL profiles were similar or not across the environments. The R of the correlation to the Chamber scores for the 17 genotypes for each of the four PCA vectors is shown in conjunction with the p value as determined by Pearson correlation. To the right of each PCA vector label is shown the fraction of total variance approximated by the given vector. In total, the four vectors describe >99% of the GSL profile variance.
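
The comparison described in this footnote can be sketched in a few lines of Python. This is a simplified illustration, not the authors' analysis: it extracts only the first principal component (by power iteration, a technique chosen here for brevity), the data are hypothetical numbers for five genotypes and three compounds, and all function names are made up for the example. Because Pearson correlation is unaffected by shifting the scores, the choice of centering for the field data does not change the result.

```python
import math


def column_means(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]


def center(rows, means):
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    n, p = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def first_loading(cov, iterations=500):
    """Leading eigenvector of the covariance matrix via power iteration."""
    p = len(cov)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def scores(rows, means, loading):
    return [sum(x * l for x, l in zip(row, loading)) for row in center(rows, means)]


def pearson_r(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov_ab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov_ab / math.sqrt(var_a * var_b)


# Hypothetical genotype means (rows) for three GSL compounds (columns).
chamber = [[1.0, 10.0, 0.0], [0.5, 4.0, 0.0], [8.0, 1.0, 0.0],
           [0.5, 2.0, 12.0], [0.0, 0.1, 0.0]]
field = [[1.2, 8.5, 0.0], [0.4, 3.0, 0.0], [6.5, 1.1, 0.0],
         [0.6, 1.5, 10.0], [0.0, 0.2, 0.0]]

chamber_means = column_means(chamber)
loading = first_loading(covariance(center(chamber, chamber_means)))

# Loadings come from the chamber data only; both data sets are projected on them.
chamber_scores = scores(chamber, chamber_means, loading)
field_scores = scores(field, column_means(field), loading)
r = pearson_r(chamber_scores, field_scores)
print(f"R between chamber and field scores on PC1: {r:.2f}")
```

A full version would repeat the projection for each of the four vectors and for each field environment, giving one R per cell of Table 4.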

In addition to the quantitative comparison of profiles, we also investigated the specificity of each locus in producing particular GSL structures to ensure that its field behavior mimicked the lab behavior. We found that, for the most part, each GSL gene produced the expected GSL phenotype in the field. For example, all lines harboring a functional AOP2 gene produce alkenyl GSL (e.g., but-3-enyl GSL) (Figures 1, 4). Additionally, the functional/non-functional allelic state at the MAM1 locus was always predictive of the chain-length of the GSL in the field as predicted from lab experiments. The lines with a functional MAM1, like Col-0, produced more 4C GSL than 3C GSL, while genotypes with a non-functional MAM1 always produced more 3C GSL than 4C GSL (Figure 4) (Haughn et al., 1991). A functional copy of GSOH, the gene encoding the enzyme to create 2-OH-but-3-enyl, always leads to the production of 2-OH-but-3-enyl GSL from but-3-enyl GSL (Figures 1, 4) (Hansen et al., 2008). In addition to the biosynthetic genes, the MYB genes, which encode transcription factors that control accumulation of aliphatic GSLs, showed field phenotypes similar to those found in the lab (Hirai et al., 2007; Sønderby et al., 2007, 2010). Specifically, a non-functional MYB28 leads to an almost complete reduction in long chain (8C) GSL and a 60% reduction in short chain (3C and 4C) GSL (Figure 4) (Hirai et al., 2007; Sønderby et al., 2007, 2010). A non-functional MYB29 leads to a 40% reduction in short chain GSL with no significant reduction in long chain GSL (Figure 4) (Hirai et al., 2007; Sønderby et al., 2007, 2010). A double mutant in MYB28 and MYB29 leads to an almost complete loss of all aliphatic GSLs, as expected (Figure 4) (Hirai et al., 2007; Sønderby et al., 2007, 2010). The only genes for which the field and laboratory GSL profile data differ are GSOX1 and GSOX3, which are two tightly linked genes at the GSOX locus that also contains two additional genes, GSOX2 and GSOX4. 
In the lab, gsox1 and gsox3 mutants accumulate higher levels of methylthio (MT) GSL than does Col-0, due to reduced expression of a flavin-monooxygenase that converts the MT to MSO GSL (Figure 1) (Hansen et al., 2007; Li et al., 2008). In the field there was no measurable accumulation of MT GSL in any line, likely due to the redundant function of the GSOX2 and GSOX4 genes (Kerwin et al., 2011, 2012; Li et al., 2008). Thus, the field results show that the laboratory work on GSL genotypes and their associated GSL profiles is translatable and predictive of the GSL profiles found in naturally fluctuating environments.

Environment and genetic variation interact to control GSL accumulation in the field

Conducting field trials in multiple environments enabled us to test the effect of different environmental conditions on our field traits. The specific composition of GSLs within a genotype largely did not change across the environments (Table 4). In contrast, the total amount of aliphatic GSL content, that is, the sum of all aliphatic GSLs per sample, showed a significant genotype by environment effect, indicating that the impact of environment on total aliphatic GSL accumulation varied among the different GSL genotypes in this study (Table 3 and Figure 5). For example, the AOP2 genotype showed a dramatic variation in total aliphatic GSL across the three field trials (Figure 5). In contrast, a number of other genotypes tended to show similar accumulation across the environments. For example, genotypes with a myb28/myb29 double knockout accumulated virtually no GSL in all three environments. Thus, the GSL genotype is the dominant determinant of GSL profile in the field while total aliphatic GSL accumulation is determined by an interaction of genotype and environment within our laboratory population.

Total aliphatic GSL accumulation of GSL genotypes from the laboratory population grown in the field.

Shown are the genotype averages in all three environments of total aliphatic GSL from individuals grown in the field. Results are based on single leaf analysis of 4 week old plants. Bar color based on Dunnett's multiple comparison procedure. Within each environment, dark grey bars = Col-0 genotype, black bars = genotypes that accumulate significantly more or less total aliphatic GSL than Col-0 (p value ≤ 0.05), light grey bars = genotypes that accumulate suggestively more or less total aliphatic GSL than Col-0 (p value = 0.05–0.1) and white bars = genotypes that are not significantly different than Col-0 (p value >0.1). Error bars represent standard error of the mean.

https://doi.org/10.7554/eLife.05604.032

Leaf damage in the field varies across environment

A critical way in which plant environments fluctuate is with respect to insect populations that vary both temporally and spatially in a manner that could have a profound impact on variation in plant damage (Mauricio, 1998; Richards et al., 2009). To assess if changes in environment impact herbivory levels, we measured leaf damage on a scale from 0–10 in all three field trials, with and without a pesticide treatment (Figure 6). A mixed model analysis showed that leaf damage significantly varied across the three environments but that the pesticide application did not significantly alter leaf damage in the field (Table 3). The UWY2012 field trial (mean = 2.610) had significantly higher levels of leaf damage than both UWY2011 (mean = 1.17, p value <1e-04) and UCD2012 (mean = 1.50, p value <1e-04), though UCD2012 and UWY2011 environments did not differ significantly for leaf damage (p value = 0.44). Field plots were treated with pesticides once every 2 weeks, which did not entirely eliminate leaf damage on the treated individuals. A more aggressive pesticide treatment regime would have been necessary to abolish leaf damage in the treated group. In addition, the levels of leaf damage measured in our study are low relative to other field studies in Arabidopsis (Bidart-Bouzat and Kliebenstein, 2008). The field site was located adjacent to other experimental field sites and greenhouses that also treated for pests, which may or may not have had an impact on the relative levels and/or pesticide resistance of herbivores in the vicinity. This combination of low overall leaf damage levels and the fact that the pesticide treatment did not eliminate leaf damage in the treated group is likely the cause for this lack of a treatment effect. However, there is a significant environment effect for leaf damage, indicating that this trait varied across the three field trials. In fact, we see no significant correlation of leaf damage across the three environments (Table 5). 
This suggests the three environments experienced differing herbivory pressures. Since we did not measure herbivore levels, we cannot determine whether the differences in leaf damage are the direct result of differences in insect populations. It is interesting to note that the UWY field site showed both the highest and lowest leaf damage levels, demonstrating that there can be potentially large temporal fluctuations in herbivory at a single location (Table 3—source data 2).

Mean normalized leaf damage of GSL genotypes from the laboratory population grown in the field.

Shown are the mean normalized genotype averages in all three environments of leaf damage from GSL genotypes grown in the field. Mean normalization was conducted by first dividing the genotype average of each GSL genotype within an environment by the corresponding environment average. Then, each normalized value was multiplied by the grand mean across all three environments. This was done in order to put the leaf damage estimates in each environment on the same order of magnitude, to ease visual comparisons of genotypes across environments and to highlight the fact that relative leaf damage of a given GSL genotype varies across environments.

https://doi.org/10.7554/eLife.05604.033
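The mean normalization described in the caption above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `mean_normalize` and the example values are hypothetical, the environment average is taken as the mean of the genotype averages, and the grand mean is taken as the mean of the environment averages. The study may have computed these means from individual plants instead.

```python
from statistics import mean

def mean_normalize(genotype_means):
    """Rescale genotype averages so every environment shares the grand mean.

    genotype_means maps environment -> {genotype: average leaf damage}.
    """
    # Environment average: assumed here to be the mean of the genotype averages.
    env_means = {env: mean(vals.values()) for env, vals in genotype_means.items()}
    # Grand mean: assumed here to be the mean of the environment averages.
    grand_mean = mean(env_means.values())
    return {
        env: {g: v / env_means[env] * grand_mean for g, v in vals.items()}
        for env, vals in genotype_means.items()
    }

# Hypothetical values for two genotypes in two environments (not study data).
example = {
    "UCD2012": {"Col-0": 1.0, "AOP2": 2.0},
    "UWY2012": {"Col-0": 2.0, "AOP2": 6.0},
}
normalized = mean_normalize(example)
```

After this rescaling every environment has the same average, so differences between environments in the plot reflect only the relative standing of each genotype.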
Table 5

Environmental correlations for leaf damage in the field

https://doi.org/10.7554/eLife.05604.035
| UCD2012-UWY2012 | | UWY2012-UWY2011 | | UCD2012-UWY2011 | |
| --- | --- | --- | --- | --- | --- |
| R | p value | R | p value | R | p value |
| −0.25 | - | 0.01 | - | 0.22 | - |
  1. Shown are Pearson's correlations for leaf damage between the different environments. R = correlation coefficients; p value = probability statistic. Non-significant p values are represented by a dash.

Environment interacts with GSL genotype to impact leaf damage in the field

GSL variation is known to affect leaf damage incurred by insect herbivory within a controlled lab setting and we wanted to test if this could also be observed within a naturally fluctuating field setting (Lambrix et al., 2001; Kliebenstein et al., 2002; Beekwilder et al., 2008; Hansen et al., 2008). Within a field environment, levels of leaf damage significantly varied across GSL genotypes, in agreement with the role of GSL in deterring herbivory (Table 3). However, the extent of leaf damage incurred by different GSL genotypes in the field fluctuated among environments, such that no particular GSL genotype showed a consistent maximal or minimal level of leaf damage across the three field trials (Figure 6). For example, the myb28/AOP2 and AOP2 genotypes had similar herbivory in UCD2012 (mean = 1.30 and 1.95, respectively) and UWY2011 (mean = 1.29 and 0.80, respectively) but strongly diverged in UWY2012 (mean = 1.45 and 5.64, respectively) (Figure 6 and Figure 6—source data 1). It has been shown in a laboratory setting that the extent to which GSL profile provides resistance varies across different herbivore species (Kroymann et al., 2003; Pfalz et al., 2007; Hansen et al., 2008). In addition, GSL have been shown to provide resistance to fungi, bacteria and nematodes, which may have also been present and variable between our environments (Manici et al., 1997; Tierens et al., 2001; Aires et al., 2009; Witzel et al., 2013). It is likely that the composition of the herbivore communities differed between the two field sites. Though we did not conduct a complete survey of the herbivores present at UWY and UCD, we did observe differences in leaf damage patterns between the two locations, suggesting that there would be differences in the composition of herbivore species present.
Together, these results show that GSL variation controls differential leaf damage in each field trial but the specific directions of effect for individual GSL genotypes are subject to environmental conditions, such as the composition of herbivores, which can vary temporally and spatially.

GSL variation and the environment impact fitness in the field

Since our laboratory population contains single gene variants, we have the ability to test the fitness consequences of individual genotypes in a field setting, an important step in connecting the GSL variation observed among Arabidopsis accessions with potential selective and non-selective evolutionary processes. To test if the GSL genotypes alter plant fitness in the three environments, we measured fecundity of each individual grown in the field. Plants were harvested from the field at maturity and the numbers of fruits, flowers and buds per plant were counted in the laboratory to yield total fruit count (TFC). TFC has previously been shown to be a good proxy for fecundity in Arabidopsis where total number of seeds per plant is highly correlated with total number of siliques (i.e., fruits) (Wolf et al., 2000; Kliebenstein et al., 2001c). Among the GSL genotypes we observed variation in silique length. Arabidopsis siliques contain two rows of seeds in a linear conformation, so that silique length strongly correlates with seed number at maturity, assuming uniform seed size. Therefore variation in silique length or seed size could affect our fecundity estimates. Silique length and seed size were measured from digital images of GSL genotypes harvested from the field and seed size showed no significant variation (data not shown). However, there was significant variation in silique length across GSL genotypes as well as a significant genotype by environment interaction (Table 3—source data 1). We concluded that the significant differences in silique lengths are likely reflective of fecundity and adjusted our fitness measurements using this information. Estimates of absolute fitness were therefore obtained for each individual as TFC multiplied by silique length both including and excluding individuals that did not survive to harvest. 
Survivorship was included in fitness estimates to avoid obtaining artificially inflated fitness estimates from GSL genotypes with higher death rates that would result from removing individuals that do not survive to fruiting and have a fitness of zero.
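The two absolute fitness estimates described above can be illustrated with a short sketch. The function names and the example plants are hypothetical, and silique length is treated as a plain number with no particular unit. Plants that did not survive to harvest are assumed to score zero when survivorship is included and are dropped when it is excluded.

```python
def absolute_fitness(total_fruit_count, silique_length, survived):
    """TFC x silique length; plants that died before harvest score zero."""
    if not survived:
        return 0.0
    return total_fruit_count * silique_length

def mean_absolute_fitness(plants, include_survivorship):
    """Average fitness over plants given as (TFC, silique length, survived)."""
    if not include_survivorship:
        # Excluding survivorship: keep only plants that reached harvest.
        plants = [p for p in plants if p[2]]
    values = [absolute_fitness(*p) for p in plants]
    return sum(values) / len(values)

# Hypothetical plants of one genotype (not study data); the last one died.
plants = [(100, 1.5, True), (40, 1.0, True), (0, 0.0, False)]
with_survivorship = mean_absolute_fitness(plants, include_survivorship=True)
without_survivorship = mean_absolute_fitness(plants, include_survivorship=False)
```

In this toy example the estimate that ignores the dead plant is higher than the one that counts it as zero, which is the inflation the text describes for genotypes with higher death rates.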

In this study, GSL genotype had a significant impact on absolute field fitness (Table 6). There was also a significant interaction effect between GSL genotype and environment for absolute fitness both including and excluding survivorship, suggesting that the impact that GSL genotype has on fitness is conditioned upon the environment (Table 6). Environment did not show a significant main effect on either measure of absolute fitness, suggesting that the population mean for absolute fitness may have been comparable across the environments and instead it is the fitness of GSL genotypes relative to each other within an environment that varies. Thus, these GSL genotypes that recreate natural variation within a single common genetic background influence field fitness of A. thaliana in an environmentally dependent manner.

Table 6

Mixed model of fitness phenotypes in the field

https://doi.org/10.7554/eLife.05604.036
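| Fixed effects | Absolute fitness (w/survivorship) | | | | | Absolute fitness (w/out survivorship) | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Source of variation | df | SS | MS | F | p value | df | SS | MS | F | p value |
| Genotype | 16 | 455453 | 28466 | 2.2 | 4.9E-03 | 16 | 397186 | 24824 | 2.3 | 2.0E-03 |
| Environment | 2 | 88326 | 44163 | 2.2 | - | 2 | 127177 | 63588 | 4.6 | - |
| Treatment | 1 | 1948 | 1948 | 0.2 | - | 1 | 2042 | 2042 | 0.3 | - |
| Geno:Env | 32 | 706781 | 22087 | 1.7 | 0.01 | 32 | 508962 | 15905 | 1.5 | 0.04 |
| Geno:Trt | 16 | 137795 | 8612 | 0.6 | - | 16 | 169036 | 10565 | 1.0 | - |
| Env:Trt | 2 | 2918 | 1459 | 0.1 | - | 2 | 4883 | 2442 | 0.2 | - |
| Geno:Env:Trt | 32 | 348291 | 10884 | 0.8 | - | 32 | 235224 | 7351 | 0.7 | - |

| Random effects | Absolute fitness (w/survivorship) | | | | | Absolute fitness (w/out survivorship) | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Source of variation | df | SS | MS | Chi.sq | p value | df | SS | MS | Chi.sq | p value |
| Plot(Trt:Env) | 1 | 2640 | 2640 | 216.5 | 0 | 1 | 3311 | 3311 | 279.3 | 0 |
| Residual | 1692 | 12581 | 7 | NA | | 1451 | 10061 | 7 | NA | |

| Fixed effects | Relative fitness | | | | | Survivorship | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Source of variation | df | SS | MS | F | p value | df | SS | MS | F | p value |
| Genotype | 16 | 28 | 2 | 2.7 | 3.9E-04 | 16 | 5 | 0 | 4.9 | 6.9E-10 |
| Environment | 2 | 2 | 1 | 1.0 | - | 2 | 2 | 1 | 9.3 | 0.01 |
| Treatment | 1 | 0 | 0 | 0.2 | - | 1 | 0 | 0 | 1.2 | - |
| Geno:Env | 32 | 39 | 1 | 1.8 | 4.2E-03 | 32 | 9 | 0 | 3.8 | 3.8E-12 |
| Geno:Trt | 16 | 8 | 1 | 0.8 | - | 16 | 1 | 0 | 0.8 | - |
| Env:Trt | 2 | 0 | 0 | 0.1 | - | 2 | 0 | 0 | 0.1 | - |
| Geno:Env:Trt | 32 | 17 | 1 | 0.8 | - | 32 | 3 | 0 | 1.1 | - |

| Random effects | Relative fitness | | | | | Survivorship | | | | |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Source of variation | df | SS | MS | Chi.sq | p value | df | SS | MS | Chi.sq | p value |
| Plot(Trt:Env) | 1 | 0.1 | 0.1 | 184.5 | 0 | 1 | 0 | 0 | 0 | - |
| Residual | 1692 | 1 | 3.8E-04 | NA | | 1900 | 0.1 | 3.5E-05 | NA | |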
  1. A linear mixed model was fitted to phenotypes measured on plants grown in the field. Variation was partitioned among the fixed effects, Genotype, Environment, and Treatment as well as a random factor, Plot, inside which Treatment and Environment were nested. Phenotypic data used in the model was collected on 17 genotypes from three environments and two treatment groups. df = degrees of freedom, SS = Type II Sums of Squares variation, MS = Mean Squared variation, F = F statistic. Non-significant p values (>0.05) are represented by a dash.

To visualize if the rank in absolute fitness of GSL genotypes fluctuates among the three environments and to compare the patterns of fluctuation of GSL genotypes across environments, we plotted the mean normalized fitness of all GSL genotypes in all environments for both absolute fitness measures, including and excluding survivorship (Figure 7 and Figure 7—figure supplement 1). Absolute fitness varied greatly between the highest and lowest ranked GSL genotypes within each of the environments (Figure 7 and Figure 7—source data 1). In addition, the performance of different GSL genotypes relative to each other varied across environments, so that no GSL genotype outperformed all the others in all three environments. For example, myb28/AOP2 shows the greatest fitness in the UCD2012 environment and the lowest fitness in UWY2012. In contrast, myb28/gsoh shows an opposite pattern while other genotypes show a diversity of other patterns (Figure 7). This fluctuation in rank of GSL genotypes across environments can also be observed if we look at fluctuations of TFC with and without survivorship across the three environments, though the patterns for specific GSL genotypes vary across the different fitness measures (Figure 7—figure supplement 1). Thus, it appears that the significant interaction of GSL genotype by environment controlling fitness is caused by fluctuations in the fitness rank of different genotypes across environments (Figure 7 and Figure 7—source data 1).

Figure 7
Mean normalized absolute fitness of GSL genotypes from the laboratory population grown in the field.

Shown are the mean normalized genotype averages of absolute fitness from GSL genotypes grown in all three environments calculated either including or excluding survivorship, as indicated. Absolute fitness including survivorship was calculated as total fruit count (TFC) × silique length × survivorship, whereas absolute fitness excluding survivorship was calculated as TFC × silique length for individuals that survived to harvest. Mean normalization was conducted for each phenotype by first dividing the average of each GSL genotype within an environment by the corresponding population mean for each environment. Then, each normalized value was multiplied by the grand mean across all three environments. This was done in order to put the phenotype estimates in each environment on the same order of magnitude to ease visual comparisons. Solid lines represent distinct patterns that GSL genotypes display across the environments and are meant as a visual aid.

https://doi.org/10.7554/eLife.05604.037

Within an environment, individuals compete against their neighbors for resources during their lifetime and natural selection favors those with higher performance relative to others. Therefore, in addition to absolute fitness, we also analyzed the effect of the GSL genotype on relative fitness in the field, both with and without survivorship. We calculated relative fitness of each GSL genotype within each environment as absolute fitness divided by the population mean within that environment. Even more strongly than with our absolute fitness measurements, we found that GSL genotype and the interaction between GSL genotype and environment both had a significant impact on relative fitness in the field both including and excluding survivorship (Table 6). For example, myb28 has a higher than average relative fitness in UWY2011 but shows an average and slightly lower than average relative fitness in UWY2012 and UCD2012, respectively (Figure 8). In other cases, relative fitness of a GSL genotype is similar among the UWY field trials but differs in the UCD field trial. Two examples with opposite patterns are myb28/AOP2, which has low relative fitness in both UWY field trials but higher relative fitness in UCD, and gsm1, which has high relative fitness in both UWY field trials but lower relative fitness in UCD. This indicates that temporal and spatial fluctuations in fitness can both occur and are dependent on genotypic differences.
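As a small illustration of the relative fitness calculation, the sketch below divides each genotype's absolute fitness by the population mean of one environment. The function name and the values are hypothetical, and the population mean is assumed to be the mean over the genotype averages supplied; the study may have averaged over individuals instead.

```python
from statistics import mean

def relative_fitness(absolute_by_genotype):
    """Divide each genotype's absolute fitness by the environment's mean."""
    # Population mean: assumed here to be the mean over genotype averages.
    population_mean = mean(absolute_by_genotype.values())
    return {g: w / population_mean for g, w in absolute_by_genotype.items()}

# Hypothetical absolute fitness values in one environment (not study data).
one_environment = {"Col-0": 80.0, "myb28": 120.0, "AOP2": 100.0}
relative = relative_fitness(one_environment)
```

A value above 1 marks a genotype that performs better than the population average in that environment, and a value below 1 marks one that performs worse.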

Relative and absolute fitness of GSL genotypes from the laboratory population grown in the field.

Heatmaps with hierarchical clustering of GSL genotypes representing the model corrected means of (A) absolute fitness including survivorship and (B) relative fitness of each genotype in each environment. Absolute fitness was calculated for each individual as the total fruit count × silique length × survivorship. Relative fitness was calculated by normalizing absolute fitness for each genotype against the population mean within an environment.

https://doi.org/10.7554/eLife.05604.040

Interestingly, heatmaps of absolute fitness and relative fitness reveal unexpected hierarchical clustering of the environments between the two traits (Figure 8). In both cases, UCD2012 clusters with UWY2011 and the two UWY field trials do not cluster together, showing that within an environment across years there is the potential for greater variability than across environments.

Pleiotropic links to GSL genes

In our analysis, we measured flowering time and total indole GSL in the field. In a laboratory setting, GSL genes have been observed to pleiotropically alter these traits (Kerwin et al., 2011). In the field, both of these phenotypes were significantly affected by the GSL genetic variation in our synthetic population, indicating that aliphatic GSL genes can have pleiotropic impacts beyond the aliphatic GSL pathway that can be observed in natural settings (Table 3 and Table 3—source data 1). Therefore, there is the possibility that either of these phenotypes might be driving the observed variation in fitness of GSL genotypes across these environments. To test this, we conducted genetic correlations using the genotypic means for absolute fitness, flowering time and total indole GSL within each environment (Table 7). We did not observe a significant correlation between absolute fitness and our pleiotropic traits, using either parametric or non-parametric approaches, in any of our three environments (Table 7). This indicates that while the GSL genes are causing pleiotropic effects, these pleiotropic effects are probably not driving the observed fitness consequences of the GSL genotypes in our field trials.
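The parametric and non-parametric correlations on genotypic means can be sketched with the standard library alone. This is an illustrative sketch, not the analysis code used in the study: the function names are hypothetical, tied values receive average ranks, and no p values are computed.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Given one value per genotype for two traits in a single environment, `pearson` and `spearman` return the two kinds of coefficient reported in Table 7.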

Table 7

Genetic correlations between fitness and pleiotropic traits in the field

https://doi.org/10.7554/eLife.05604.041
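| Trait (UCD2012) | Absolute fitness | Flowering time | Total indole GSL |
| --- | --- | --- | --- |
| Absolute fitness | | −0.40 | −0.21 |
| Flowering time | −0.22 | | 0.06 |
| Total indole GSL | −0.05 | −0.23 | |

| Trait (UWY2011) | Absolute fitness | Flowering time | Total indole GSL |
| --- | --- | --- | --- |
| Absolute fitness | | −0.24 | −0.27 |
| Flowering time | −0.19 | | −0.05 |
| Total indole GSL | −0.43 | 0.12 | |

| Trait (UWY2012) | Absolute fitness | Flowering time | Total indole GSL |
| --- | --- | --- | --- |
| Absolute fitness | | 0.22 | −0.25 |
| Flowering time | 0.29 | | 0.59* |
| Total indole GSL | −0.21 | 0.38 | |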
  1. Shown are genetic correlations between absolute fitness and traits pleiotropically controlled by GSL genes. Pearson's correlation coefficients are on the top half of the tables and Spearman rank correlations are on the bottom. *p value < 0.05, **p value < 0.01, ***p value < 0.001.

Non-random variation of GSL loci among field collected accessions

To test if natural Arabidopsis accessions show a pattern of variation consistent with fluctuating selection, we determined the GSL haplotype for a global collection of accessions using their GSL profile (Figure 2). Using the validated GSL phenotype caused by genetic variation at the eight causal genes for the aliphatic GSL pathway, we assigned a GSL haplotype to each Arabidopsis accession, given its GSL profile (Table 1 and Figure 1). Using the available GSL profile information, the underlying allelic state at each of the eight genes was assigned as functional or non-functional, based on presence or absence of different GSL structures as well as the relative abundances of different structures, that is, based on the GSL profile of the accession. This identified 18 distinct aliphatic GSL haplotypes among the set of 144 natural Arabidopsis accessions, observed at different frequencies (Figure 9 and Figure 9—source data 1). Using the observed single locus allelic frequencies, we calculated the expected GSL haplotype frequencies for each of the 18 multi-locus genotypes (Figure 9—source data 1). These expected frequencies for the GSL genotypes represent theoretical frequencies that would be expected if no selection gradient acted upon GSL variation and no genetic drift, migration or other non-selective effect upon population structure biased the allele distribution. A comparison of observed vs expected frequencies showed that the distribution of haplotypes was highly non-random (p < 0.001) (Figure 9 and Figure 9—source data 1). Further, specific multi-locus GSL genotypes occurred significantly more or less often than expected (Figure 9 and Figure 9—source data 1). Thus, the non-random variation of GSL haplotypes within the Arabidopsis accessions supports the observations from the empirical field trials. It is similarly possible that this observed non-random variation is caused by non-selective processes such as migration, population structure and/or local bottlenecks.
Significant future efforts will be required to test the extent to which this non-random variation is caused by neutral demographic processes vs potential fluctuating selection.
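The comparison of observed and expected haplotype frequencies can be illustrated with a minimal sketch. It assumes that loci combine independently, so the expected frequency of a multi-locus haplotype is the product of its single-locus allele frequencies, and that every locus is scored as '+' or '-'. Loci that are unobservable due to epistasis ('NA' in Figure 9) are not handled, and the function names and toy haplotypes are hypothetical.

```python
from collections import Counter
from math import prod

def locus_frequencies(haplotypes):
    """Frequency of each allelic state ('+' or '-') at every locus."""
    n = len(haplotypes)
    freqs = []
    for i in range(len(haplotypes[0])):
        counts = Counter(h[i] for h in haplotypes)
        freqs.append({state: c / n for state, c in counts.items()})
    return freqs

def expected_frequency(haplotype, freqs):
    """Product of single-locus frequencies, assuming independent loci."""
    return prod(freqs[i].get(state, 0.0) for i, state in enumerate(haplotype))

def chi_square(observed_counts, expected_counts):
    """Chi-squared statistic comparing observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed_counts, expected_counts))

# Toy population of four accessions scored at two loci (not study data).
accessions = ["++", "++", "--", "--"]
freqs = locus_frequencies(accessions)
possible = ["++", "+-", "-+", "--"]
observed = [accessions.count(h) for h in possible]
expected = [expected_frequency(h, freqs) * len(accessions) for h in possible]
statistic = chi_square(observed, expected)
```

In the toy data both alleles are equally common at each locus, yet only two of the four possible haplotypes occur, so the observed counts depart from the expectation under independence.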

GSL haplotype frequencies of Arabidopsis thaliana accessions based on GSL profile data from chamber-grown individuals.

Shown are the GSL haplotypes observed among a population of 144 Arabidopsis accessions for which our lab had existing GSL data. Seven loci important in the aliphatic GSL pathway were called based on GSL profile data from the lab as ‘+’ = functional, ‘−’ = non-functional or ‘NA’ = unobservable due to epistasis (see Figure 1 for an explanation of epistasis in the GSL biosynthetic pathway). Bar length represents the observed GSL genotype frequencies among 144 Arabidopsis accessions. Bar color represents the difference, for a given GSL haplotype, between expected and observed genotype frequencies, based on Chi Squared distribution (significant p values shown). Blue = GSL genotypes found more frequently than expected (p value ≤ 0.05), red = GSL genotypes found less frequently than expected (p value ≤ 0.05) and grey = GSL genotypes found as frequently as expected (p value >0.05).

https://doi.org/10.7554/eLife.05604.042

Discussion

Ecologically and evolutionarily important traits often show considerable phenotypic variation in nature that is quantitative, polygenic and interacts with the environment. A clear example of this is aliphatic GSL accumulation in Arabidopsis, which is highly polygenic and environmentally dependent (Figures 1, 4, 5). However, it has been complicated to validate that specific polymorphic loci within a pathway are the actual causative basis of any changes in fitness due to the use of polygenic populations (Lande and Arnold, 1983). In this study, using a single gene manipulation approach that has allowed us, over the past decade, to recreate natural allelic diversity in the aliphatic GSL pathway, we have shown that GSL genetic variation at numerous loci directly impacts Arabidopsis fitness in the field (Table 1, Figure 7, Figure 7—figure supplement 1, Figure 8). Because we have only manipulated the GSL genes within an otherwise isogenic background, we can directly conclude that it is these specific genes and their GSL phenotypes that are determining the differences in fitness in the field. Further experiments will optimally generate the full 256-line matrix containing all combinations of alleles between all loci to fully interrogate the effects of all loci in all possible backgrounds. We should also note that even with all of our efforts to clean up the respective backgrounds and validate that the mutant phenotypes are similar to the segregating natural genotypes, it remains possible that some of the observed effects are caused by unexpected changes in the lines.

More difficult however is to ascribe the specific selective forces acting on this GSL variation to produce a fitness effect. GSL are known plant defensive compounds and variation in GSL genotype was shown to significantly impact GSL profiles, leaf damage and fitness in the field (Table 3). While GSL variation did alter measured leaf damage in the field, the patterns did not fully reflect the relative fitness spectrum of these same genotypes (Figures 6–8). One possibility is that our experiment, even with 20 blocks (10 control/10 pesticide treated) per field trial, was still insufficient to identify the underlying link, suggesting the need for larger experiments. Another possibility is that there were different herbivore populations between these environments, which agrees with the observation that there was no genetic correlation of herbivore resistance across the three field trials (Table 5). The fact that different GSLs defend against different herbivores would complicate finding the specific link between GSL loci and a population of herbivores (Kroymann et al., 2003; Falk and Gershenzon, 2007; Pfalz et al., 2007; Hansen et al., 2008; Falk et al., 2014). Additionally, our herbivory measures are limited to foliar damage, which obfuscates any potential interactions between GSL genotype and root pathogens. Supporting this idea, previous studies have found that GSL can influence a number of root pathogens and commensal microbes (Bending and Lincoln, 2000; van Dam et al., 2008; Bressan et al., 2009; Millet et al., 2010; Witzel et al., 2013). While these organisms could directly impact plant fitness, this interaction is highly difficult to detect or control in field trials.

In addition to unmeasured biotic stresses, there is the potential for causal links between GSL genes and abiotic pressures. We showed that the GSL genes have pleiotropic effects on development, such as flowering time, that, while having no link to fitness in our experiments, could impact fitness in other environments. Similarly, previous work has shown that individual GSL structures directly modulate stomatal closure in response to wounding (Zhao et al., 2008). Furthermore, analysis of natural variation and validation lines showed that GSL structure and amount can influence the circadian clock and flowering time (Kerwin et al., 2011). Other experiments have also identified a potential for regulatory roles with indole GSL (Clay et al., 2009). Thus, these are not indirect pleiotropies but direct regulatory links whereby GSLs may influence the plant's abiotic responses, potentially to alter the biotic interactions. Thus, it is possible that the observed GSL to fitness links result from a complex web of biotic and abiotic effects. Identifying the specific selective agents affected by GSL variation will require the development of techniques for rapid and systematic identification of all foliar and root herbivores and microbes from field samples as well as a complete physiological and developmental analysis of the plant within the field. This is especially critical as the specific agents of selection may be highly variable across environments.

Within our multiple field trials, we found that effects of GSL genes on fitness are highly dependent upon the environment in which the experiment is conducted (Table 6). The fitness effects of the naturally polymorphic GSL genes were such that each environment had a different optimal set of GSL genotypes (Figure 7, Figure 7—figure supplement 1, Figure 8). Similarly, no particular GSL genotype had the maximal fitness in all environments (Figure 7, Figure 7—figure supplement 1, Figure 8). This suggests that the GSL defense pathway might be a system in which genetic variation could be stabilized by fluctuating selection across the environments. Fully exploring this hypothesis will require extensive assessment of genetic variation at the polymorphic GSL loci within natural populations and more extensive field trials of this synthetic population that recreates natural diversity at these same loci.

Within species that are highly but not exclusively selfing, such as A. thaliana, temporal variation in selection is not solely sufficient to maintain genetic diversity (Dempster, 1955; Bomblies et al., 2010). This would require either spatial variation in fitness and/or variation within a seed bank to provide extra drive for the system (Dempster, 1955; Turelli et al., 2001; Turelli and Barton, 2004). Recent work has begun to show that Arabidopsis has a robust multi-generational seed bank in natural populations (Lundemo et al., 2009; Bomblies et al., 2010). Further, there is extensive allelic variation within small local regions that contain different habitats, that would likely experience different insect pressures, providing the potential for spatial variation in fitness (Bomblies et al., 2010). Thus, both conditions necessary for fluctuating selection to maintain diversity in Arabidopsis exist, but we do not yet know enough about the extent of the seed bank or spatial variation in selection within Arabidopsis to fully model the system. This shows that a greater understanding of life history traits, seed bank history and migration rates in natural populations of Arabidopsis is necessary to determine if fluctuating selection is contributing to the maintenance of variation in this species.

Conclusions

Based on our measures of fitness in the field, we showed that GSL variation can control fitness within the field. These fitness effects were not driven by pleiotropic phenotypes like flowering, but the specific selective pressures driving these fitness differences remain to be identified. Identifying these pressures will require vastly larger surveys of natural populations and long-term field trials. Using the empirical values for fitness, we could show that the GSL system within these environments fits models where fluctuating selection can maintain standing polygenic variation. Further trials are required to test if this is applicable across a broader range of environments. This would require more field trials using our synthetic population to provide the capacity to empirically evaluate models of maintenance of standing variation and its influence on adaptation (Gillespie and Turelli, 1989; Orr, 1998; Agrawal, 2001). It remains to be directly tested if similar evolutionary processes drive evolution of other ecologically important traits that must respond to fluctuating environmental conditions such as pathogen populations and water availability.

Materials and methods

Synthetic laboratory population generation

The following eight loci in the aliphatic GSL pathway were modified in the synthetic laboratory population of A. thaliana genotypes: AOP2 (At4g03060), ESP (At1g54040), MYB28 (At5g61420), MYB29 (At5g07690), GSOH (At2g25450), MAM1 (At5g23010), GSOX1 (At1g65860), GSOX3 (At1g62560). The following knockout or complementation lines for these loci in A. thaliana Col-0 were used to generate the lab population: AOP2 = 35S:AOP2 (Li and Quiros, 2003), ESP = 35S:ESP (Burow et al., 2006), MYB28 = SALK_136312 (Sønderby et al., 2007), MYB29 = SM.34316 (Hirai et al., 2007), GSOH = SALK_09807 (Hansen et al., 2008), MAM1 = EMS mutant line gsm1 (Haughn et al., 1991), GSOX1 = SALK_079493 (Li et al., 2008), GSOX3 = CSHL_GT13906 (Li et al., 2008). Mutant lines were manually crossed to each other to generate a population of plants containing homozygous combinations of mutations in the different genes mentioned above, representing a subset of the potential variation in the aliphatic GSL pathway observed among Arabidopsis accessions (Table 2). Individuals were genotyped via PCR using the primers and reaction conditions listed below.

PCR primer sets and reaction conditions for genotyping

Locus | Primer | Sequence | Group
MYB29 gene | myb29-1 RP | 5′-TATGTTTGCATCATCTCGTCTTC-3′ | 1
MYB29 gene | myb29-1 LP | 5′-TTGTAGATTGCGATGGGCTA-3′ | 1
MYB29 T-DNA | myb29-1 RP | 5′-TATGTTTGCATCATCTCGTCTTC-3′ | 1
MYB29 T-DNA | myb29-1 LB | 5′-ATATTGACCATCATACTCATTGC-3′ | 1
AOP2 gene | AOP2 FOR ODD13 | 5′-AACAGCGAAACGATCCAGAAGA-3′ | 1
AOP2 gene | AOP2 REV ODD24 | 5′-GTGCTTCTCGTCCACAA-3′ | 1
MAM1 gene | gsm1-2 FOR | 5′-TCATCGCTTCTGACATCTTCC-3′ | 1
MAM1 gene | gsm1-2 REV | 5′-GTCTTGGCGATGGTCTTAATG-3′ | 1
GSOX3 gene | gsox3 RP (P3P) | 5′-TCGTCCTGACAAGACTGCTG-3′ | 2
GSOX3 gene | gsox3 LP (P3P) | 5′-GAGGGTCCAGTCGAAAAACTC-3′ | 2
GSOX3 T-DNA | gsox3 RP (P3P) | 5′-TCGTCCTGACAAGACTGCTG-3′ | 2
GSOX3 T-DNA | LB2 | 5′-GCTTCCTATTATATCTTCCCAAATTACCAATACA-3′ | 2
GSOH gene | GSOH RP1 | 5′-GCTTCGGGATTAGGAGGAAC-3′ | 2
GSOH gene | GSOH LP | 5′-ATGAAGATTGGCGTGAAAGG-3′ | 2
GSOH T-DNA | GSOH RP1 | 5′-GCTTCGGGATTAGGAGGAAC-3′ | 2
GSOH T-DNA | LBb1.3 | 5′-ATTTTGCCGATTTCGGAAC-3′ | 2
GSOX1 gene | gsox1 RP (P3P) | 5′-CTAGCGCGGGTAGAAAGACAT-3′ | 3
GSOX1 gene | gsox1 LP (P3P) | 5′-GCATTCCAAAAATACCATAACG-3′ | 3
GSOX1 T-DNA | gsox1 RP (P3P) | 5′-CTAGCGCGGGTAGAAAGACAT-3′ | 3
GSOX1 T-DNA | LB2 | 5′-GCTTCCTATTATATCTTCCCAAATTACCAATACA-3′ | 3
MYB28 gene | myb28-1 RP | 5′-TGTATAAACCAGCTTTTTGGGG-3′ | 3
MYB28 gene | myb28-1 LP | 5′-TTTTTCATTATGCGTTTGCAG-3′ | 3
MYB28 T-DNA | myb28-1 RP | 5′-TGTATAAACCAGCTTTTTGGGG-3′ | 3
MYB28 T-DNA | LBa1 | 5′-TGGTTCACGTAGTGGGCCATCG-3′ | 3

Reaction conditions for group 1
Step | Temperature | Time
Initial melting | 94°C | 30 s
32 cycles | 94°C | 30 s
32 cycles | 60°C | 45 s
32 cycles | 72°C | 90 s
Final extension | 72°C | 10 min

Reaction conditions for group 2
Step | Temperature | Time
Initial melting | 94°C | 30 s
30 cycles | 94°C | 30 s
30 cycles | 61°C | 45 s
30 cycles | 72°C | 90 s
Final extension | 72°C | 10 min

Reaction conditions for group 3
Step | Temperature | Time
Initial melting | 94°C | 45 s
30 cycles | 94°C | 45 s
30 cycles | 65°C | 45 s
30 cycles | 72°C | 90 s
Final extension | 72°C | 10 min
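
To illustrate how paired "gene" and "T-DNA" primer sets like those listed above are commonly read out, the sketch below calls a genotype from the presence or absence of the two PCR products. The function name and the calling rule (gene product only = wild type, T-DNA product only = homozygous insertion, both = heterozygous) are assumptions based on standard T-DNA genotyping practice, not a procedure stated in the text. Loci with only a gene primer set (AOP2, MAM1) are not covered.

```python
def call_tdna_genotype(gene_band: bool, tdna_band: bool) -> str:
    """Hypothetical helper: call a T-DNA genotype from two PCR results.

    gene_band -- True if the gene-specific (RP + LP) product amplified
    tdna_band -- True if the T-DNA-specific (RP + left border) product amplified
    """
    if gene_band and tdna_band:
        return "heterozygous"
    if tdna_band:
        return "homozygous insertion"
    if gene_band:
        return "wild type"
    return "failed reaction"  # neither product amplified; the PCR should be repeated


# Example: gene product absent, T-DNA product present
print(call_tdna_genotype(gene_band=False, tdna_band=True))
```

Only plants scored as homozygous at every targeted locus would fit the homozygous combinations described for the synthetic population.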

Experimental settings

Field trials were conducted at two locations, one of them in two consecutive years, giving three separate environments in total. The first field trial was performed at the University of Wyoming (UWY) in Laramie, WY during Summer 2011, the second at the University of California, Davis (UCD) in Davis, CA during Spring 2012, and the third at UWY during Summer 2012. Seeds were sown into flats with 2 inch 50-celled inserts using Sunshine #5 (Sungro, Agawam, MA) potting soil containing slow release fertilizer and stratified at 4°C for 4 days before being transferred into the greenhouse at either UWY or UCD to facilitate germination synchrony. Starting all the plants in the greenhouse also minimized variation in the initial seedling conditions. In the UWY greenhouse, plants received a 15 hr light/9 hr dark natural photoperiod with temperatures fluctuating diurnally from 10°C to 30°C. In the UCD greenhouse, plants received a 14 hr light/10 hr dark natural photoperiod with temperatures fluctuating from 15°C to 35°C. After germination, seedlings were thinned to one per pot and GSL genotypes were randomly organized into 40 blocks per field trial, for a total of 120 blocks and therefore 120 replicates of each GSL genotype. Individuals were transplanted from the greenhouse into the field 2 weeks post germination. A single plant of each genotype was present in each block in all three environments and blocks were arranged into four rows of ten blocks each (Figure 3). Each row of 10 blocks is referred to as a plot, so that there were four plots per field trial and 12 plots total. Each plot is nested within a treatment by environment combination.
Every 14 days, two plots (20 blocks total) per environment were treated with pesticides to decrease leaf damage due to herbivory. At UWY, plants were sprayed with the insecticide Sevin (GardenTech, Palatine, IL) to repel flea beetles. At UCD, plants were treated with Marathon 1% granular (OHP, Mainland, PA) and Lily Miller Slug, Snail & Insect Killer Bait (Lily Miller Brands, Walnut Creek, CA). The plants were allowed to grow in the field for 4 weeks before being harvested. At harvest, the aerial portion of each plant was collected from the field, placed into a quart sized freezer bag and transferred to 4°C for temporary storage. Once the harvest was complete, the UCD field plants were immediately placed at −80°C for storage. The UWY field plants were shipped to UC Davis overnight on dry ice and then placed at −80°C for storage.
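
The layout of a single field trial can be sketched as follows. This is a minimal illustration, assuming 17 genotypes (the number given in the statistical methods), placeholder genotype names, and random assignment of the pesticide treatment to two of the four plots; the text does not say how treated plots were chosen, and `build_layout` is a hypothetical helper.

```python
import random


def build_layout(genotypes, n_plots=4, blocks_per_plot=10, treated_plots=2, seed=0):
    """Return one record per plant for a single field trial (hypothetical helper)."""
    rng = random.Random(seed)
    plots = list(range(1, n_plots + 1))
    # Assumption: treated plots are picked at random; the source does not specify this.
    treated = set(rng.sample(plots, treated_plots))
    layout = []
    for plot in plots:
        treatment = "pesticide" if plot in treated else "control"
        for block in range(1, blocks_per_plot + 1):
            order = list(genotypes)
            rng.shuffle(order)  # one plant of each genotype, randomized within the block
            for position, genotype in enumerate(order, start=1):
                layout.append({
                    "plot": plot,
                    "block": block,
                    "treatment": treatment,
                    "position": position,
                    "genotype": genotype,
                })
    return layout


# Placeholder genotype names; only Col-0 is named in the text.
genotypes = ["Col-0"] + [f"G{i:02d}" for i in range(2, 18)]
layout = build_layout(genotypes)
print(len(layout))  # 4 plots x 10 blocks x 17 genotypes = 680 plants
```

Repeating this for the three environments gives the 120 blocks and 12 plots described above.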

GSL extraction, HPLC separation and GSL structure identification

GSL were measured on all field trial plants to assess field effects of the genotypes on GSL accumulation. At approximately 4 weeks of age, a single, fully expanded, green leaf was collected from each plant. In order to measure the leaf area of each sample, leaves from twelve plants at a time were placed on a white sheet of paper with a grid overlay. A ruler was placed on the sheet of paper below the leaves and the sheet was digitally imaged using a Nikon D3100 (Tokyo, Japan). The photographed leaves were then placed directly into 96 deep well plates containing 400 μl 90% methanol and stored in the freezer until extraction. For the UWY field trial, the leaves were stored at −20°C for 3–4 weeks and shipped overnight to Davis, CA on dry ice. For the Davis field trial, all plates were stored at −20°C until extraction. After harvest, desulfoglucosinolates were extracted from all samples using a high-throughput protocol briefly described below (Kliebenstein et al., 2001a). One gram of Sephadex DEAE A-25 (Sigma–Aldrich, St. Louis, MO) was added to each well of a 96 well filter plate using a column loader. To hydrate the Sephadex, 300 μl of H2O was transferred to each well using a multichannel pipet and allowed to incubate at room temperature for 1 hr. Excess H2O was removed from the Sephadex by placing the filter plate on top of a 96 deep well discard plate (used to catch the flow through) and centrifuging at 1000 rpm for 2 min. To homogenize the samples, 96 deep well plates containing a single A. thaliana leaf, two 2.3 mm ball bearings and 400 μl of 90% methanol in each well were shaken in a Harbil 5-Gallon Mixer (Fluid Management Co., Wheeling, IL) for 3–5 min. Plates were centrifuged at 4000 rpm for 20 min. To bind GSL to the Sephadex, 150 μl of supernatant from each well (containing the extracted organic compounds) was transferred to the corresponding well of the 96 well filter plate containing hydrated Sephadex and centrifuged at 1200 rpm for 3 min on top of the 96 deep well discard plate.
To wash away all the non-binding organic compounds from the Sephadex, 150 μl of 90% methanol was added to each well and the plate was centrifuged at 1200 rpm for 3 min. To remove excess methanol, two wash steps were conducted by adding 150 μl of H2O to the plate followed by centrifugation at 1200 rpm for 3 min. To release the GSL from the Sephadex, 10 μl of sulfatase (Sigma–Aldrich) and 100 μl of water were added to each well of the 96 well filter plate, which was then incubated overnight in the dark. The desulfoglucosinolates were then eluted into a 96 well round bottom plate via centrifugation at 1200 rpm for 3 min. For each GSL sample, 50 μl of the 110 μl of extract was injected on an Agilent 1100 HPLC (Agilent, Santa Clara, CA) using a Lichrocart 250–4 RP18e column (Hewlett–Packard, Palo Alto, CA). Individual GSL compounds were detected at 229 nm and separated using the following program with an aqueous acetonitrile gradient: a 6-min gradient from 1.5% to 5.0% (vol/vol) acetonitrile, followed by a 2-min gradient from 5% to 7% (vol/vol) acetonitrile, a 7-min gradient from 7% to 25% (vol/vol) acetonitrile, a 2-min gradient from 25% to 92% (vol/vol) acetonitrile, 6 min at 92% (vol/vol) acetonitrile, a 1-min gradient from 92% to 1.5% (vol/vol) acetonitrile, and a final 5 min at 1.5% (vol/vol) acetonitrile (Kliebenstein et al., 2001a). For each peak, the GSL structure was determined by comparing the retention time and UV absorption spectrum with known standards. The integral under each peak was automatically calculated and this value in milli-absorption units was converted to picomoles/mm² of tissue using response factor slopes determined from purified standards and the area of leaf tissue used per sample as measured by ImageJ (Kliebenstein et al., 2001a; Reichelt et al., 2002).
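
The acetonitrile program described above can be written down as a list of segments, which makes the total run time and the solvent composition at any time point easy to check. This is a sketch that assumes each gradient segment is linear; the names `GRADIENT` and `acetonitrile_pct` are introduced here for illustration.

```python
# (duration in min, start % acetonitrile, end % acetonitrile), in the order given in the text
GRADIENT = [
    (6, 1.5, 5.0),
    (2, 5.0, 7.0),
    (7, 7.0, 25.0),
    (2, 25.0, 92.0),
    (6, 92.0, 92.0),   # hold at 92%
    (1, 92.0, 1.5),
    (5, 1.5, 1.5),     # final hold at 1.5%
]


def acetonitrile_pct(t_min: float) -> float:
    """Percent acetonitrile (vol/vol) at time t_min, assuming linear segments."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    start = 0.0
    for duration, pct_start, pct_end in GRADIENT:
        end = start + duration
        if t_min <= end:
            fraction = (t_min - start) / duration
            return pct_start + fraction * (pct_end - pct_start)
        start = end
    raise ValueError("time is beyond the end of the program")


total_minutes = sum(duration for duration, _, _ in GRADIENT)
print(total_minutes)            # 29
print(acetonitrile_pct(11.5))   # 16.0, halfway through the 7% to 25% segment
```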

Leaf damage measurements in the field

Leaf damage was estimated visually in the field at 4 weeks of age, just before tissue collection for GSL extraction. A scale from 0 to 10 was used to score the amount of pest damage incurred on each plant, with 0 representing no damage and 10 representing complete destruction of the plant (i.e., the plant completely eaten).

Absolute fitness and relative fitness

Absolute fitness was calculated as total fruit count (TFC) × silique length × survival. TFC was measured as the count of fruits (siliques) + flowers + buds per individual. Silique length was measured in ImageJ from digital images of harvested field plants taken using a Nikon D3100 as follows: each plant was placed flat on a white sheet of paper next to a ruler and pictures were taken using auto focus. After setting the scale in ImageJ using the ruler placed in each image, the segmented line tool was used to draw a line from the pedicel to the tip of the silique. For each plant, eight siliques were measured at random and these values were averaged to get a value for each plant. Survival was scored on a binary (0–1) scale. Plants that germinated, were transplanted into the field and survived to harvest were given a survival score of 1, and plants that germinated and were transplanted but did not survive to harvest were given a score of 0. Individuals that did not germinate or did not survive to transplantation were given an NA. Relative fitness was calculated for each GSL genotype within each environment relative to Col-0. To do this, the average absolute fitness of a GSL genotype was divided by the average absolute fitness of Col-0 within an environment. Col-0 was chosen as the reference genotype given that it is the background genotype.
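
The fitness calculations described above can be sketched in a few lines. The plant records below are made-up numbers used only to show the arithmetic, `genotype_A` is a placeholder name, and the helper functions are hypothetical; NA plants (represented here as `survival=None`) are dropped before averaging, which is an assumption about how NA values were handled.

```python
from statistics import mean


def absolute_fitness(tfc, silique_length, survival):
    """Absolute fitness = total fruit count x mean silique length x survival (0 or 1)."""
    return tfc * silique_length * survival


def relative_fitness(plants, reference="Col-0"):
    """Mean absolute fitness of each genotype divided by that of the reference genotype."""
    by_genotype = {}
    for plant in plants:
        if plant["survival"] is None:  # NA: did not germinate or was never transplanted
            continue
        w = absolute_fitness(plant["tfc"], plant["silique_length"], plant["survival"])
        by_genotype.setdefault(plant["genotype"], []).append(w)
    means = {genotype: mean(values) for genotype, values in by_genotype.items()}
    return {genotype: m / means[reference] for genotype, m in means.items()}


# Made-up records from one environment, for illustration only
plants = [
    {"genotype": "Col-0", "tfc": 100, "silique_length": 10, "survival": 1},
    {"genotype": "Col-0", "tfc": 80, "silique_length": 10, "survival": 1},
    {"genotype": "genotype_A", "tfc": 90, "silique_length": 10, "survival": 1},
    {"genotype": "genotype_A", "tfc": 0, "silique_length": 0, "survival": 0},     # died in the field
    {"genotype": "genotype_A", "tfc": 0, "silique_length": 0, "survival": None},  # NA, excluded
]
rel = relative_fitness(plants)
print(rel)
```

With this normalization the reference genotype always has a relative fitness of 1, so values above or below 1 indicate genotypes that out- or under-performed Col-0 in that environment.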

Statistical analysis methods

All statistical analyses were carried out using the R statistical computing language (R Core Team, 2014). The field trial was conducted in a split plot design with each plot nested within treatment by environment. We used a restricted maximum likelihood (REML) approach to fit a linear mixed effects model to the field traits and to partition the variation of each trait among the fixed effects (genotype, environment and treatment) and the random factor (plot nested within treatment and environment). Genotype refers to the GSL genotype in the synthetic laboratory population, of which there were 17. There were three environments: Wyoming 2011, Wyoming 2012 and Davis 2012. The two treatments were control and pesticide treated. We had 4 plots per environment (2 in each treatment group) for a total of 12 plots. We used the following formula to fit this model using the lme4 package in R (Baayen et al., 2008):

lmer(Trait ∼ Genotype*Environment*Treatment + (1|Plot(Treatment:Environment))).
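
Written out, the formula above corresponds to a model of roughly the following form. This is a sketch for orientation: the symbols and indices are notation introduced here rather than the authors', and normally distributed plot and residual terms are assumed, as is usual for models fitted by REML. In current lme4 syntax the nested random term would typically be written `(1 | Environment:Treatment:Plot)`.

```latex
\begin{aligned}
y_{ijklm} ={}& \mu + G_i + E_j + T_k
             + (GE)_{ij} + (GT)_{ik} + (ET)_{jk} + (GET)_{ijk} \\
           & + p_{l(jk)} + \varepsilon_{ijklm},
\qquad p_{l(jk)} \sim \mathcal{N}(0, \sigma_p^2),
\quad \varepsilon_{ijklm} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
```

Here y is the trait value, G_i is the effect of genotype i (17 levels), E_j the effect of environment j (3 levels), T_k the effect of treatment k (2 levels), p_{l(jk)} the random effect of plot l nested within environment j and treatment k, and m indexes the blocks within a plot.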

The Anova function from the car package in R was used to determine which fixed effects significantly altered the mean of each trait (p ≤ 0.05) (Fox and Weisberg, 2011). We estimated population means (i.e., LSMeans) of each field trait for all genotypes across treatment and environment using the LSMeans function from the doBy package in R (Højsgaard et al., 2013). Dunnett's multiple comparison test was performed on the traits to determine which genotypes had means significantly different from that of Col-0, our reference genotype, using the glht function from the multcomp package in R (Hothorn et al., 2014). Additionally, Tukey's multiple comparison test was performed on the traits to compare every genotype with every other genotype, using the same glht function (Hothorn et al., 2014). PCA was conducted using the princomp function from the stats package distributed with R (R Core Team, 2014).

References

  20. Darwin C (1859) On the origin of species by means of natural selection, or the preservation of favoured races in the struggle for life. London: John Murray.
  23. Dempster ER (1955) Maintenance of genetic heterogeneity. Cold Spring Harbor Symposia on Quantitative Biology 20:25–32. https://doi.org/10.1101/SQB.1955.020.01.005
  24. Duret L (2008) Neutral theory: the null hypothesis of molecular evolution. Nature Education 1:218.
  28. Fox J, Weisberg S (2011) An R companion to applied regression (2nd edition). Thousand Oaks, CA: Sage.
  29. Gillespie JH, Turelli M (1989) Genotype-environment interactions and the maintenance of polygenic variation. Genetics 121:129–138.
  37. Højsgaard S, Halekoh U, From WC, Robinson-Cox J, Wright KM, et al. (2013) doBy–groupwise summary statistics, LSmeans, general linear contrasts, various utilities.
  43. Kerwin RE, Feusier J, Rubin M, Corwin JA, Lin C, Muok A, Larson B, Li B, Joseph B, Francisco M, Copeland D, Weinig C, Kliebenstein DJ (2012) Data from: field evidence that fluctuating selection can maintain natural genetic variation in Arabidopsis thaliana defense. Dryad Digital Repository, 10.5061/dryad.8qp37/1.
  46. Kliebenstein DJ, Gershenzon J, Mitchell-Olds T (2001a) Comparative quantitative trait loci mapping of aliphatic, indolic and benzylic glucosinolate production in Arabidopsis thaliana leaves and seeds. Genetics 159:359–370.
  49. Kliebenstein DJ, Pedersen D, Barker B, Mitchell-Olds T (2002) Comparative analysis of quantitative trait loci controlling glucosinolates, myrosinase and insect resistance in Arabidopsis thaliana. Genetics 161:325–332.
  58. Long AD, Lyman RR, Morgan AH, Langley CH, Mackay TF (2000) Both naturally occurring insertions of transposable elements and intermediate frequency polymorphisms at the achaete-scute complex are associated with variation in bristle number in Drosophila melanogaster. Genetics 154:1255–1269.
  71. Mothershead K, Marquis RJ (2000) Fitness impacts of herbivory through indirect effects on plant-pollinator interactions in Oenothera macrocarpa. Ecology 81:30–40.
  79. Purugganan MD, Suddith JI (1999) Molecular population genetics of floral homeotic loci: departures from the equilibrium-neutral model at the APETALA3 and PISTILLATA genes of Arabidopsis thaliana. Genetics 151:839–848.
  82. Richards CL, Hanzawa Y, Katari MS, Ehrenreich IM, Engelmann KE, Purugganan MD (2009) Perspectives on ecological and evolutionary systems biology. Annual Plant Reviews 35:331–351.
  88. R Core Team (2014) R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing.
  95. Weinig C, Dorn LA, Kane NC, German ZM, Halldorsdottir SS, Ungerer MC, Toyonaga Y, Mackay TF, Purugganan MD, Schmitt J (2003) Heterogeneous selection at specific loci in natural environments in Arabidopsis thaliana. Genetics 165:321–329.

Decision letter

  1. Merijn R Kant
    Reviewing Editor; University of Amsterdam, Netherlands

eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.

Thank you for sending your work entitled “Natural Genetic Variation in Arabidopsis thaliana Defense Metabolism Genes Modulate Field Fitness” for consideration at eLife. Your article has been favorably evaluated by Ian Baldwin (Senior editor) and three reviewers, one of whom is a member of our Board of Reviewing Editors.

The following individuals responsible for the peer review of your submission have agreed to reveal their identity: Merijn Kant (Reviewing editor) and Michael Turelli (who was consulted by the referees for advice). The other two reviewers remain anonymous.

The Reviewing editor and the other reviewers discussed their comments before we reached this decision, and the Reviewing editor has assembled the following comments to help you prepare a revised submission.

Kerwin and colleagues created a range of glucosinolate mutants in a single genetic background of Arabidopsis. These mutants were designed such that in each mutant line distinct branches of the glucosinolate pathway were removed selectively, and these predicted phenotypes were validated. Then they composed variable populations by transplanting these mutants together at several field sites during several years to monitor several fitness proxies. The manuscript revolves around the question of to what extent specific genes can affect fitness differently under different environmental conditions. The value of this work is that it provides a comprehensive assessment of known genes and fluctuating patterns of herbivory and fitness in nature. Hence, Kerwin and colleagues argue that fluctuating selection may maintain polymorphisms in glucosinolate biosynthesis genes in natural populations.

The authors extend beyond their data after reasoning that these different environments basically reflect environmental fluctuations and they use their data to parametrize a model of Turelli and Barton (2004) in order to find indications that fluctuating selection could be responsible for maintaining allelic variation in natural Arabidopsis populations and their analysis suggests that it does. In addition, they collected 144 Arabidopsis accessions to assess if the signature of fluctuating selection can be seen among these accessions as well and this analysis shows non-random variation of glucosinolate haplotypes suggesting that fluctuating selection may maintain genetic variation here as well.

We all agreed that the manuscript reads very well and that the data set is extremely interesting and comprehensive. However, we also have significant concerns with respect to the usage of the model of Turelli and Barton (subsection “Fluctuating selection estimates”) and the interpretation of the analysis performed on the 144 accessions (in the subsection “Non-random variation of GSL loci among field collected accessions”).

1) Firstly, the model of Turelli and Barton cannot be used for this type of data. During our discussion we consulted Michael Turelli directly and together we reached the following conclusion: although these data certainly warrant a discussion of whether fluctuating selection plays a role here, there is, to the best of our knowledge, no simple formula that provides the appropriate polymorphism condition in this case. The detailed comments of Turelli are below and can form an excellent basis for a revised discussion on known genes and fluctuations in herbivory and fitness in nature (we were missing Prasad et al., 2012, 337:1081 in Science, which during the discussion came up as one of the few, if not only, other examples). Furthermore, we suggest you estimate the genetic correlation between environments for levels of herbivore resistance among these genotypes. This will be informative, should not be difficult to calculate from the existing data, and will bring depth to the Discussion. Please also take into account in this discussion that glucosinolates can affect fitness through effects other than herbivore damage alone.

2) Secondly, we all agreed that the interpretation of the analysis on the 144 field collected accessions is too opportunistic. Non-random variation of glucosinolate loci among natural accessions could be caused by fluctuating selection, but there are equally valid alternatives (not mutually exclusive) such as drift, gene flow or historical population structure. At present, the conclusions in the subsection headed “Non-random variation of GSL loci among field collected accessions” are biased much too selectively towards the first. We strongly suggest you thoroughly rephrase the interpretation, doing justice to the alternatives, or, and this may be the better option, remove this analysis from your study altogether.

We hope that this letter, the minor comments of the referees and the report of Dr. Turelli will help you to revise your manuscript for eLife.

Please find below detailed comments of Michael Turelli in response to the manuscript and the discussion among the referees:

“This is indeed deep water. Given the technical nature of the relevant theory it should be no surprise that both the authors and the reviewers make incorrect assertions, but both make important points.

First, the relevance of the Turelli and Barton (2004) conditions to maintenance of variation involving genotypes created from “mutants [involving] two or three mutated alleles (different loci) [brought] together to get the desired phenotype.” The Turelli and Barton (2004) conditions involve the effects of alleles at individual loci. Hence, one cannot invoke conditions for multilocus genotypes. As I understand this is a key criticism of the reviewers (“the locus effects reported here are confounded by the effects of other loci that are in disequilibrium in this experimental population”). This criticism is correct, but see below.

Second, the relevance of the Turelli and Barton (2004) conditions to any genetic variants in Arabidopsis thaliana, which is predominantly selfing. Turelli and Barton (2004) explicitly assume random mating. If Arabidopsis thaliana were completely selfing, which I believe is reasonable as a first approximation, populations would be effectively composed of competing clonal genotypes (the referees mention here that Arabidopsis is less clonal than indicated by Turelli, otherwise genome-wide association studies would be impossible). In this case, the distinction between variants at one locus vs multiple loci is irrelevant, as is the distinction between diploidy and haploidy. Hence, the diploid-random-mating conditions provided by Turelli and Barton (2004) are irrelevant to the maintenance of variation. In general, temporal fluctuations alone cannot maintain variation for a haploid, the genotype with the highest geometric mean fitness will prevail (this idea goes back to Dempster 1955, cited in Turelli et al. 2001, Evolution 55:1283-1298, which explicitly deals with the maintenance of a famous flower-color polymorphism). For such populations, one must invoke either spatially varying fitness or temporal variation with a seed bank. The exact conditions for the maintenance of variation will depend on the nature of gene flow between patches with alternative selection regimes and the extent to which the seed bank creates overlapping generations. These issues are discussed by Turelli et al. (2001).

The conditions provided by Turelli and Barton (2004) cannot be directly applied to the Kliebenstein system. However, showing that different genotypes are favored under different conditions does indeed suggest that fluctuating selection (in time and/or space) may contribute to the maintenance of this polymorphism. The exact mathematical conditions relevant to this system have probably not been worked out. However, assuming near-complete selfing, maintenance of variation will require overlapping generations via a seed bank (I'm not sure if this is relevant to A. thaliana) and/or spatial variation with gene flow. The exact conditions will be subtle and depend on biological details that are surely not known.”

Reviewer #2 minor comments:

1) Abstract: “fitness effects were significant in each environment but the pattern fluctuated such that highly fit alleles in one year displayed lower fitness in another.” This experiment doesn't compare alleles, it compares multilocus genotypes.

2) In the subsection headed “Environment interacts with GSL genotype to impact leaf damage”: “no particular GSL genotype showed a consistent maximal or minimal level of leaf damage across the three field trials”. This conclusion that no genotype has consistently highest or lowest damage level obscures the highly significant main effect of genetic differences in damage levels (Table 3).

3) Discussion: “it has been complicated to validate that specific polymorphic loci within a pathway are the actual causative basis of any changes in fitness due to the use of polygenic populations”. Unfortunately, this study shares the same shortcoming, because the chosen combination of multilocus genotypes prevents a clear test for antagonistic pleiotropy at individual loci.

4) Discussion: “we can directly conclude that it is these specific genes and their GSL phenotypes that are determining the differences in fitness in the field”. True, but that is not sufficient to show that fluctuating selection maintains genetic variation at any particular locus.

5) In the subsection “Statistical analysis methods”: Each plot has 10 blocks. Why is there no block term in the ANOVA?

6) Fitness is normalized with respect to performance of Col-0. This is contrary to the usual definition of relative fitness (with a mean of 1.0). Because of this normalization to Col-0, the reported effects on relative fitness may reflect behavior of Col-0 rather than population means.

7) What does Figure 8 tell us, beyond the main and interaction effects already reported in Table X?

8) In the subsection headed “Pleiotropic Links to GSL Genes”: “while the GSL genes are causing pleiotropic effects, these pleiotropic effects are not driving the observed fitness consequences of the GSL genotypes in our field trials.” A simple pairwise correlation analysis is not sufficient to support this conclusion. More complex ANCOVA-like models might be helpful here, perhaps with principal components of GSLs.

Reviewer #3 minor comments:

1) Introduction: this sentence gives the impression that intermediate frequency variants are frequent in the genome. This is misleading, because the majority of variants do have low frequency. It is just that there are several intriguing examples of intermediate frequency alleles. Furthermore, people have tried to relate FRI to fitness and failed in most cases.

2) Also in the Introduction: I wonder why the authors do not mention the work of Fournier-Level et al. 2011, which does relate underlying genes to fitness.

3) In the subsection “Leaf damage in the field varies across environments”: pesticide application is somewhat unfortunate because it artificially decreases herbivory load and thus might explain why the expected effect of GSL variation on fitness via herbivore defense was not observed. This must be better incorporated in the Discussion.

4) In the subsection “Fluctuating selection estimates”: I really appreciate that the authors include now the model of Turelli and Barton, but the model and its predictions should be briefly summarized (for the reader this will come out of the blue!). More explanations are needed in the Methods as well. Did the authors use LSmeans to calculate the parameters? Were LSmeans calculated after correction of error over dispersion?

5) In the subsection “Non-random variation of GSL loci among field collected accessions”: this formulation is somehow strange. The words “population structure” should be mentioned so that the right bell rings for the reader.

6) In the same subsection: the authors should be clearer here that this analysis cannot exclude the possibility that the skewed haplotype distribution can result from population structure only.

7) Discussion: this aspect of the Discussion is still too uncritical for me. The authors should not forget that their findings are not expected and may be driven by processes that were not supposed to play a role. What about insertion effects? Inserted transgenes can disrupt other genes and may result in fitness effects that are unrelated to GSL function. EMS lines initially contain thousands of mutations. They are generally removed by multiple generations of backcrossing to the wild type. Crossing adds some additional backcrossing; is it possible that lines differ in the number of linked mutations? Finally, transgenes are not always stably expressed and could be silenced in one or the other generation, especially if several T-DNA insertion lines are coupled. Such problems are well known to (good) molecular biologists but often overlooked by ecologists. I believe it is important to critically assess the possibility that the manipulated lines may not be doing exactly what they are supposed to do.

[Editors’ note: an earlier version of this manuscript was also reviewed. The previous decision letter after peer review is shown below.]

Thank you for choosing to send your work entitled “Field Evidence that Fluctuating Selection Can Maintain Natural Genetic Variation in Arabidopsis thaliana Defense” for consideration at eLife. Your full submission has been evaluated by Ian Baldwin (Senior editor) and three peer reviewers, one of whom, Merijn Kant, is a member of our Board of Reviewing Editors, and the decision was reached after discussions between the reviewers.

The referees appreciated the impressive field experiment and are convinced it has delivered a highly valuable data set that deserves publication. However, the current manuscript needs too much work to warrant a revision for eLife. The referees were predominantly concerned about three issues: (1) the framing of the story around the fluctuating selection-theme inferred from the genotype distribution of 144 Arabidopsis accessions; (2) the statistical analyses of the field experiment data and (3) on how the fitness proxies came about. I will briefly summarize the main comments which you will also find in more detail in the full referee reports:

1) The referees feel that “fluctuating selection/ bet hedging” theme should only (at best) appear in the Discussion and certainly not be a central theme. Much of this has to do with the analysis presented in Figure 2 using the 144 Arabidopsis accessions for drawing conclusions on natural selection as a determinant of their genotype distribution.

2) The ANCOVA analysis using the 144 accessions and its interpretation is troublesome, and maybe should be removed altogether. Several of the remaining statistical analyses (Figures 4-8) should be adjusted with respect to model structure, survival and zeroes and multiple testing.

3) The validity of the fitness-proxy parameters should be better justified.

Reviewer #1:

General assessment:

The field experiment is really very interesting and has delivered a valuable data set. I disagree with several aspects of the analysis pipeline and hence I do not see sufficient support for the main conclusions. The article is not easy to read since phrasing is often vague and imprecise.

My main criticisms:

1) The projection of the experimental data onto the genotype-distribution of 144 Arabidopsis accessions does not work for me:

A) I fail to see why you would expect to see evidence for natural selection within a group of plants (accessions) of which you do not know what the original selection criteria were: for sure these are not ecotypes and it is unclear to which extent they represent the genotypes of their respective original populations (Weigel, 2012, Plant Physiol: 158:2-22). This should at least be discussed in much more detail.

B) The analysis presented in Figure 2 that justifies the field experiment (in the subsection headed “Structured population mimicking natural GSL variation in Arabidopsis”) is not strong since a test for the goodness of fit of the overall observed vs expected relative frequency distributions is missing (e.g. chi square test for goodness of fit).

C) How does the analysis of the Figure 2 data exclude non-selective processes as an alternative for random assortment? (I think these are not mutually exclusive.)

D) You write “field studies confirm lab results” (at the end of the subsection “GSL genetic variation controls GSL profile in the field”), but you do not provide a direct comparative analysis, just a 'visual' interpretation of the data.

E) In the subsection “Empirical fluctuating measures of selection in the field predict standing variation in GSL genes”, you conclude that “selection likely played a role”, but you do not explain where in the figures we can see this. Looking at the three plots (Figure 9), I only see that the correlation between frequency and your relative fitness-proxy is weak (probably due to UWY2012).

2) You used a “deterministic” approach to genotype the 144 accessions i.e. on the basis of their “GSL profiles” but a validation of this approach is missing:

A) Throughout the manuscript it remains unclear what is meant by “profile” and how these were evaluated (e.g. which grouping criteria/procedures/assignment to a genotype). Hence its validity cannot be assessed.

B) In the subsection “GSL genetic variation controls GSL profile in the field”, it is unclear to which extent actual profile information was used for the downstream analyses or when only the information on the total (aliphatic) glucosinolates was used (e.g. see Figures 4 and 5). Figure 4 appears to represent these “profiles” but a statistical evaluation is not provided.

C) Why was this indirect approach for genotyping preferred over a direct (DNA-based) approach?

3) The factors of the statistical analyses are often unclear:

A) The factor “location” is misleading: it should be “environment” since the different locations were used at different moments in time. Why do you refer to these differences as 'fluctuation' i.e. how do you know they are not caused solely by differences in starting conditions? Now you infer these differences from Figure 5 but the patterns across the three panels should be statistically evaluated to decide to what extent they differ.

B) The statistical “interaction” is misinterpreted and the text dances around its meaning. A significant interaction indicates that the simultaneous effects of the factors are not additive, i.e. the combined effect is either greater (synergistic) or smaller (antagonistic) than the expected (additive) effect. Pinpointing what such interactions mean is sometimes virtually impossible and requires post hoc statistics.

C) The ANCOVA procedure is not explained anywhere.

4) Your fitness proxies, and especially how they were normalized, need validation. In “GSL variation impacts fitness in the field”, you describe a normalization procedure (which assumes a linear relationship between silique length and fecundity) which struck me as highly arbitrary. This procedure needs references or a solid validation.

Reviewer #2:

This manuscript describes the performance in the field of 20 lines differing only in 8 loci controlling glucosinolate synthesis. The authors used 20 lines combining one or more variants at these loci. They planted them in a randomized block design, in 40 replicates, in two consecutive years in WY and in 2012 at UC Davis. In each experimental block, the level of herbivory was manipulated with the help of pesticides.

They monitored the glucosinolate profile of these lines in the field and observed that these profiles broadly match those observed in the lab. They further quantified leaf damage and demonstrated that the genetic differences between these lines alter leaf damage levels. Finally, the authors present evidence that the fitness of individual lines differs within and between sites/years.

In order to address the relevance of the specific observations they made at these two sites, the authors also used known glucosinolate levels to predict the relative occurrence of the glucosinolate profiles in a set of 144 natural accessions. They report that the lines that have the highest fitness in one of their field site/year tend to be the ones whose genotype is overrepresented in the natural population.

This work is unique: the system is very well known, the genetic differences between the lines alter glucosinolate production in the field and the match between observed frequencies and fitness is very exciting. To my knowledge, this is the first time that the fitness consequences of an extant molecular system have been studied in the field. This work demonstrates that glucosinolate production is important for fitness. I find it equally interesting that this fitness relevance cannot be explained by insect damage. This shows that the ecological function of glucosinolates is not as straightforward as one may think. Finally, the authors document the complexity of field studies in A. thaliana, where variation across years and locations dramatically alters fitness levels.

This work is excellent and I strongly recommend it for publication in eLife.

Reviewer #3:

This manuscript examines herbivory and fitness in field sites in the USA for a series of transgenic glucosinolate genotypes, and compares these local fitness estimates to the frequency of genotypes in 144 European Arabidopsis accessions. The authors conclude that the relationship between field fitness and observed glucosinolate genotype frequencies is due to fluctuating selection pressures that maintain genetic variation in nature. The questions are of great ecological and evolutionary significance, but these conclusions are compromised by statistical and evolutionary shortcomings.

Statistics:

1) This is a split plot design (for insecticide treatment), not a randomized complete blocks design. Correction of this analysis will certainly affect inferences related to the insecticide treatment, and may alter other parts of the model.

2) Inclusion of individuals with zero fitness is not compatible with the distributional assumptions of this ANOVA. This should be visible in the residuals, although no statement is made regarding such verification of statistical assumptions. Such zero-inflated data pose a difficult problem in such analyses, which typically alter levels of statistical significance and often cause spurious rejection of the null hypothesis.

3) Controls for multiple statistical tests are needed at several points in this manuscript. Examples include tests for non-random distribution of multilocus GSL genotypes (Table 2), comparison of treatment effects, and variation of GSL among sites (Figure 5).

Evolution and Genetics:

4) The evolutionary significance of this study is justified as a test whether fluctuating selection maintains genetic variation within and among populations. However, these experiments do not estimate herbivory or fitness for the individual GSL loci, and they do not show change in rank fitness at putatively selected loci, which is a necessary condition for balancing selection to maintain non-neutral genetic polymorphism.

5) Even if the patterns in Figure 8 were due to selection, they might be attributable to non-equilibrium directional selection in Eurasia, rather than to historical balancing (fluctuating) selection. Consequently, this analysis cannot prove that such patterns are due to fluctuating pressures maintaining standing natural variation within a species.

6) The observation that multi-locus genetic variation controlling aliphatic GSL appears to be non-randomly distributed among the natural accessions is interpreted as evidence for natural selection. However, this may also result from population structure, nonrandom geographic sampling, finite population size, or failure to correct for multiple tests.

7) “We observed significant variation in silique lengths across genotypes” How do GSL polymorphisms alter silique length? And, how do these effects differ among pesticide treatments? Alternatively, rather than the effects of GSL polymorphisms, variation in silique length (and fitness) may be due to position effects of the transgene inserts, or untagged Agrobacterium hits, or linked mutations not eliminated following EMS.

8) Several points weaken the proposed role of herbivory in shaping the observed patterns in GSL polymorphisms:

8A) In Figure 6 we see that the ranking of herbivore damage (without insecticide) decreases from UWY2012 > UCD2012 > UWY2011. However, the correlations in Figure 12 show the opposite pattern: UWY2012 < UCD2012 < UWY2011. This seems incompatible with the conclusion that the herbivory differences in the field reflect natural selection by herbivores in Eurasian Arabidopsis over thousands of generations.

8B) Similarly, the authors note that “the positive correlation between observed genotype frequency and fitness disappeared in the high herbivory WY2012 field trial.” This suggests that these results may not be due to herbivory.

8C) To test for herbivory-mediated effects, one could ask whether the ANCOVA is significant if only the no-pesticide treatment is analyzed. Or, what happens to the patterns in Figure 8 if the change in fitness between insecticide treatments is used as the response variable for the ANCOVA?

https://doi.org/10.7554/eLife.05604.044

Author response

We all agreed that the manuscript reads very well and that the data set is extremely interesting and comprehensive. However, we also have significant concerns with respect to the usage of the model of Turelli and Barton (subsection “Fluctuating selection estimates”) and the interpretation of the analysis performed on the 144 accessions (in the subsection “Non-random variation of GSL loci among field collected accessions”).

1) Firstly, the model of Turelli and Barton cannot be used for this type of data. During our discussion we consulted Michael Turelli directly and together we reached the following conclusion: although these data certainly warrant a discussion of whether fluctuating selection plays a role here, there is, to the best of our knowledge, no simple formula that provides the appropriate polymorphism condition in this case. The detailed comments of Turelli are below and can form an excellent basis for a revised discussion on known genes and fluctuations in herbivory and fitness in nature (we were missing Prasad et al., 2012, Science 337:1081, which came up during the discussion as one of the few, if not the only, other examples). Furthermore, we suggest you estimate the genetic correlation between environments for levels of herbivore resistance among these genotypes. This will be informative, should not be difficult to calculate from the existing data, and will bring depth to the Discussion. In that discussion, please take into account that glucosinolates can also affect fitness through effects other than herbivore damage alone.

We have removed the sections using the Turelli approach and we have included a new section in the Discussion commenting on what would be further required to allow for fluctuating selection to maintain diversity in Arabidopsis (i.e. seed bank, etc) and provided citational support for these requirements in Arabidopsis within the field. Finally, we have concluded this section with a comment on the dramatic need for further field trials and natural history studies to test for this potential in the field.

We apologize for having not included the Prasad citation as we had been focusing on single gene manipulations. We have now included this citation.

We have now included the genetic correlation of herbivory across the three environments, which shows that there is no correlation, in agreement with the concept of fluctuating herbivory pressures. We would also like to note that we have an entire paragraph in the Discussion (third paragraph) on glucosinolates altering fitness via their effects on other processes that may influence resistance to unmeasured abiotic fluctuations. We hope that this is sufficient.
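
One common way to approximate a cross-environment genetic correlation is to correlate genotype means between pairs of environments. The sketch below does this with a plain Pearson correlation. The damage values are hypothetical placeholders, not data from the study, and the calculation actually used by the authors may differ (for example, an estimate based on variance components).

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of genotype means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def pairwise_correlations(means_by_env):
    """Correlate genotype means for every pair of environments."""
    return {
        (env_a, env_b): pearson(means_by_env[env_a], means_by_env[env_b])
        for env_a, env_b in combinations(means_by_env, 2)
    }


# Hypothetical mean leaf damage per genotype; the genotype order is the
# same in every list. These numbers are illustrative only.
damage_means = {
    "UWY2011": [0.8, 1.1, 0.9, 1.4, 1.0],
    "UWY2012": [2.1, 1.7, 2.6, 1.9, 2.3],
    "UCD2012": [1.2, 1.5, 1.1, 1.3, 1.6],
}

correlations = pairwise_correlations(damage_means)
for pair, r in correlations.items():
    print(pair, round(r, 3))
```

Correlations near zero (or negative) between environments would be consistent with genotypes changing rank in damage from one trial to the next.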

2) Secondly, we all agreed that the interpretation of the analysis on the 144 field collected accessions is too opportunistic. Non-random variation of glucosinolate loci among natural accessions could be caused by fluctuating selection but there are equally valid alternatives (not mutually exclusive) such as drift, gene flow or historical population structure. At present, the conclusions in the subsection headed “Non-random variation of GSL loci among field collected accessions” are biased much too selectively towards the first. We strongly suggest that you thoroughly rephrase the interpretation, doing justice to the alternatives, or, and this may be the better option, remove this analysis from your study altogether.

We would like to keep this analysis as we feel it nicely connects to the empirical data and provides an alternative explanation for the distribution of variation in the accessions. We have added the following material to this section in the hopes of properly caveating the data and saying that future work needs to be done to clarify any of the hypotheses about the data: “It is similarly possible that this non-random variation is caused by non-selective processes like migration, population structure and local bottlenecks. Significant future efforts will be required to test the extent to which this non-random variation is caused by neutral demographic processes vs potential fluctuating selection.” We feel that we would be remiss if we did not include these data and hope that this more detailed caveating of all possible explanations for interpretation helps. If it is still felt that this is too much we will remove it.

We hope that this letter, the minor comments of the referees and the report of Dr. Turelli will help you to revise your manuscript for eLife.

Please find below detailed comments of Michael Turelli in response to the manuscript and the discussion among the referees:

This is indeed deep water. Given the technical nature of the relevant theory it should be no surprise that both the authors and the reviewers make incorrect assertions, but both make important points.

First, the relevance of the Turelli and Barton (2004) conditions to maintenance of variation involving “genotypes created from mutants [involving] two or three mutated alleles (different loci) [brought] together to get the desired phenotype.” The Turelli and Barton (2004) conditions involve the effects of alleles at individual loci. Hence, one cannot invoke conditions for multilocus genotypes. As I understand it, this is a key criticism of the reviewers (the locus effects reported here are confounded by the effects of other loci that are in disequilibrium in this experimental population). This criticism is correct, but see below.

Second, the relevance of the Turelli and Barton (2004) conditions to any genetic variants in Arabidopsis thaliana, which is predominantly selfing. Turelli and Barton (2004) explicitly assume random mating. If Arabidopsis thaliana were completely selfing, which I believe is reasonable as a first approximation, populations would be effectively composed of competing clonal genotypes (the referees mention here that Arabidopsis is less clonal than indicated by Turelli; otherwise genome-wide association studies would be impossible). In this case, the distinction between variants at one locus versus multiple loci is irrelevant, as is the distinction between diploidy and haploidy. Hence, the diploid-random-mating conditions provided by Turelli and Barton (2004) are irrelevant to the maintenance of variation. In general, temporal fluctuations alone cannot maintain variation for a haploid; the genotype with the highest geometric mean fitness will prevail (this idea goes back to Dempster 1955, cited in Turelli et al. 2001, Evolution 55:1283-1298, which explicitly deals with the maintenance of a famous flower-color polymorphism). For such populations, one must invoke either spatially varying fitness or temporal variation with a seed bank. The exact conditions for the maintenance of variation will depend on the nature of gene flow between patches with alternative selection regimes and the extent to which the seed bank creates overlapping generations. These issues are discussed by Turelli et al. (2001).

The conditions provided by Turelli and Barton (2004) cannot be directly applied to the Kliebenstein system. However, showing that different genotypes are favored under different conditions does indeed suggest that fluctuating selection (in time and/or space) may contribute to the maintenance of this polymorphism. The exact mathematical conditions relevant to this system have probably not been worked out. However, assuming near-complete selfing, maintenance of variation will require overlapping generations via a seed bank (I'm not sure if this is relevant to A. thaliana) and/or spatial variation with gene flow. The exact conditions will be subtle and depend on biological details that are surely not known.
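
The geometric-mean argument in these comments can be illustrated with a minimal sketch. It assumes complete selfing (clones competing by frequency), non-overlapping generations, no seed bank and no spatial structure. The genotype labels and fitness values are hypothetical and chosen only so that the genotype with the higher arithmetic mean fitness has the lower geometric mean.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive per-generation fitness values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def simulate_clonal_frequencies(fitness_by_genotype, start_freqs, cycles):
    """Track clone frequencies under a repeating sequence of environments.

    fitness_by_genotype maps each genotype to its fitness in each
    environment of the cycle; selection acts once per generation.
    """
    freqs = dict(start_freqs)
    n_env = len(next(iter(fitness_by_genotype.values())))
    for generation in range(cycles * n_env):
        env = generation % n_env
        weighted = {g: freqs[g] * fitness_by_genotype[g][env] for g in freqs}
        total = sum(weighted.values())
        freqs = {g: w / total for g, w in weighted.items()}
    return freqs


# Hypothetical fitness values: genotype A is favored in the first
# environment of the cycle, genotype B in the second.
fitness = {"A": [1.5, 0.6], "B": [0.8, 1.2]}

for genotype, values in fitness.items():
    arithmetic = sum(values) / len(values)
    print(genotype, round(arithmetic, 3), round(geometric_mean(values), 3))

final = simulate_clonal_frequencies(fitness, {"A": 0.5, "B": 0.5}, cycles=200)
print(final)
```

In this toy case genotype B approaches fixation even though each genotype is favored in one of the two environments, which is why a seed bank or spatial variation with gene flow is needed for temporal fluctuations to maintain both.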

We have added a point to the Discussion section stating that, for fluctuating selection to maintain diversity in Arabidopsis, a seed bank and/or spatial separation of selection would be required. We then cited papers providing evidence of both and commented on the need for more natural history studies to provide the detailed parameters that would allow more detailed models to be built for this situation. We hope that this helps.

Reviewer #2 minor comments:

1) Abstract: “fitness effects were significant in each environment but the pattern fluctuated such that highly fit alleles in one year displayed lower fitness in another.” This experiment doesn't compare alleles, it compares multilocus genotypes.

We agree that this experiment compares different multilocus genotypes rather than alleles. The text has been changed and “alleles” has been changed to “genotypes”.

2) In the subsection headed “Environment interacts with GSL genotype to impact leaf damage”: “no particular GSL genotype showed a consistent maximal or minimal level of leaf damage across the three field trials.” This conclusion that no genotype has consistently highest or lowest damage level obscures the highly significant main effect of genetic differences in damage levels (Table 3).

We agree that genotype is significant but the genotype x environment term is more significant and contains more than two times the variance even though linear models are known to overweight the main effect terms at the cost of the interaction terms. This is in agreement with the visual analysis in the figures where the maximal and minimally fit genotypes are inconsistent. We think the middle is where the main effect variance is coming from. Thus, we prefer keeping our original conclusion.

3) Discussion: “it has been complicated to validate that specific polymorphic loci within a pathway are the actual causative basis of any changes in fitness due to the use of polygenic populations.” Unfortunately, this study shares the same shortcoming, because the chosen combination of multilocus genotypes prevents a clear test for antagonistic pleiotropy at individual loci.

We feel that this shortcoming is by no means on the same scale as in the existing polygenic populations, which segregate for thousands of genes, some in extreme linkage (Kroymann and Mitchell-Olds, 2005, Nature 435, 95-98). In our case, we have a mere handful of discrete gene manipulations, all within a single pathway, which allows us to focus our interpretation in a way that was not previously possible. We have commented on how it would be optimal to have the full 256 line matrix containing all combinations of alleles between all loci to fully interrogate the effects of all loci in all possible backgrounds (please see the Discussion).
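
The figure of 256 lines is what a full factorial design gives for eight loci with two allelic states each (2^8 = 256). The short sketch below enumerates such a matrix. The allele labels are hypothetical, and the assumption of exactly two states per locus is inferred from the 256 figure and the eight loci mentioned in the reviewer's summary rather than stated explicitly here.

```python
from itertools import product

N_LOCI = 8            # number of manipulated loci, per the reviewer's summary
ALLELES = ("A", "a")  # hypothetical labels for the two assumed allelic states
N_LINES_TESTED = 20   # number of lines in the field trials


def all_multilocus_genotypes(n_loci=N_LOCI, alleles=ALLELES):
    """Enumerate every combination of alleles across all loci."""
    return list(product(alleles, repeat=n_loci))


genotypes = all_multilocus_genotypes()
print(len(genotypes))                                     # 256
print(f"{N_LINES_TESTED / len(genotypes):.1%} of the full matrix was tested")
```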

4) Discussion: “we can directly conclude that it is these specific genes and their GSL phenotypes that are determining the differences in fitness in the field.” True, but that is not sufficient to show that fluctuating selection maintains genetic variation at any particular locus.

We tried very carefully to re-edit the manuscript to ensure that we aren’t making specific allele claims but instead talking in general about the group of loci that we manipulated (i.e. genotypes). We hope that our significant caveats throughout this paper are sufficient to convey to the reader that we are simply providing evidence supporting a role of fluctuating selection in maintaining variance and are not arguing that we have proven this as the sole mechanism at play. As stated earlier in the letter from the editor, there are almost no studies providing similar evidence and we hope that by leaving the paper as written with it being a hypothesis that we can stimulate further research.

5) In the subsection “Statistical analysis methods”: Each plot has 10 blocks. Why is there no block term in the ANOVA?

The blocks were subtended to the plot term as suggested in the previous round of review for the split-plot analysis and as such we did not investigate the block term.

6) Fitness is normalized with respect to performance of Col-0. This is contrary to the usual definition of relative fitness (with a mean of 1.0). Because of this normalization to Col-0, the reported effects on relative fitness may reflect behavior of Col-0 rather than population means.

We have adjusted the normalization so that relative fitness reflects the population mean. Figures and tables have been adjusted accordingly.

The results of the mixed model analysis do not change depending on whether fitness is normalized relative to Col-0 or the population mean. However, we understand that it might be more common to see relative fitness calculated based on the population mean. We have changed accordingly and apologize for any confusion it may have caused.
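
The two normalizations discussed here can be compared in a minimal sketch. Within a single environment they differ only by a constant divisor, so genotype ratios and rankings are identical. The line names other than Col-0 and all fitness values are hypothetical.

```python
def relative_fitness(absolute, reference=None):
    """Scale absolute fitness by a reference genotype or by the population mean.

    If reference is None, values are divided by the mean across genotypes,
    so the resulting relative fitness values average 1.0.
    """
    if reference is None:
        denominator = sum(absolute.values()) / len(absolute)
    else:
        denominator = absolute[reference]
    return {genotype: w / denominator for genotype, w in absolute.items()}


def ranking(values):
    """Genotypes ordered from highest to lowest fitness."""
    return sorted(values, key=values.get, reverse=True)


# Hypothetical absolute fitness means for one environment
absolute = {"Col-0": 100.0, "line_1": 150.0, "line_2": 110.0}

to_col0 = relative_fitness(absolute, reference="Col-0")
to_mean = relative_fitness(absolute)

print(to_col0)
print(to_mean)
print(ranking(to_col0) == ranking(to_mean))
```

Because the divisor is computed separately in each environment, comparisons across environments can still depend on which reference is chosen, which is the concern the reviewer raises.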

7) What does Figure 8 tell us, beyond the main and interaction effects already reported in Table X?

This is a graphical representation of the data presented in the table along with the hierarchical clustering of the genotypes. Some readers approach data more graphically while others prefer the tabular form and as such we felt it was best to include both presentations.

8) In the subsection headed “Pleiotropic Links to GSL Genes”: “while the GSL genes are causing pleiotropic effects, these pleiotropic effects are not driving the observed fitness consequences of the GSL genotypes in our field trials.” A simple pairwise correlation analysis is not sufficient to support this conclusion. More complex ANCOVA-like models might be helpful here, perhaps with principal components of GSLs.

We feel that a simple pairwise correlation is sufficient to state that pleiotropic effects of flowering time cannot explain the entirety of the data. If flowering time were the sole driver, then there should have been a simple pairwise correlation. A more involved ANCOVA or path analysis would be required to delve further into cause and effect, but we feel that this is beyond the scope of this manuscript, which solely aims to show that there can be fluctuating selection without identifying the specific mechanism. As we stated in the Discussion, that mechanism may be beyond our current measurements and require further work.

Reviewer #3 minor comments:

1) Introduction: this sentence gives the impression that intermediate frequency variants are frequent in the genome. This is misleading, because the majority of variants do have low frequency. It is just that there are several intriguing examples of intermediate frequency alleles. Furthermore, people have tried to relate FRI to fitness and failed in most cases.

We have changed this section completely to better address the topic of allele frequencies. We hope that this works.

2) Also in the Introduction: I wonder why the authors do not mention the work of Fournier-Level et al. 2011 which does relate underlying genes to fitness.

Thank you for this suggestion. We have added this citation and another study on local adaptation using GWA studies (Platt, 2010 and Li, 2014). We apologize for not including these previously as we were focused on citing studies using specific single locus manipulations in the field.

3) In the subsection “Leaf damage in the field varies across environments”: pesticide application is somewhat unfortunate because it artificially decreases herbivory load and thus might explain why the expected effect of GSL variation on fitness via herbivore defense was not observed. This must be better incorporated in the Discussion.

We had intentionally included the pesticide treatment to enable us to test the effect of herbivory load upon the fitness effects. Unfortunately, the fluctuating environments led to most years having lower than expected herbivory load in these experiments, causing the treatment to have minimal impact on the results. We have incorporated a new sentence in this section of the manuscript stating that another explanation for this lack of linkage is that the experiment was too small and needs to be larger. We hope that this is sufficient. We would prefer to not focus too much effort on the lack of an observation, as this is in all likelihood an issue of power.

4) In the subsection “Fluctuating selection estimates”: I really appreciate that the authors include now the model of Turelli and Barton, but the model and its predictions should be briefly summarized (for the reader this will come out of the blue!). More explanations are needed in the Methods as well. Did the authors use LSmeans to calculate the parameters? Were LSmeans calculated after correction of error over dispersion?

As suggested by the reviewing editor, we have removed the Turelli and Barton equation work.

5) In the subsection “Non-random variation of GSL loci among field collected accessions”: this formulation is somewhat strange. The words “population structure” should be mentioned so that the right bell rings for the reader.

We have better stated the set of demographic/non-selective processes that could have generated a non-random population structure in this section.

6) In the same subsection: the authors should be clearer here that this analysis cannot exclude the possibility that the skewed haplotype distribution can result from population structure only.

Please see the comment to the editor at the beginning on how we have more fully caveated this section and said that future work needs to be done in the field to assess the relative contribution of selective and non-selective processes to this distribution.

7) Discussion: this aspect of the Discussion is still too uncritical for me. The authors should not forget that their findings are not expected and may be driven by processes that were not supposed to play a role. What about insertion effects? Inserted transgenes can disrupt other genes and may result in fitness effects that are unrelated to GSL function. EMS lines initially contain thousands of mutations. They are generally removed by multiple generations of backcrossing to the wild type. Crossing adds some additional backcrossing; is it possible that lines differ in the number of linked mutations? Finally, transgenes are not always stably expressed and could be silenced in one or the other generation, especially if several T-DNA insertion lines are coupled. Such problems are well known to (good) molecular biologists but often overlooked by ecologists. I believe it is important to critically assess the possibility that the manipulated lines may not be doing exactly what they are supposed to do.

We have added in the following material to the Discussion: “We should also note that even with all of our efforts to clean up the respective backgrounds and validate that the mutant phenotypes are similar to the segregating natural genotypes, it remains possible that some of the observed effects are caused by unexpected changes in the lines”.

We hope that with the significant efforts at cleaning up the material as described in the Materials and methods that this is sufficient. We would like to note that we can rule out T-DNA silencing as we phenotyped every plant in the analysis using HPLC so that we can be sure that all insertions were generating the appropriate biochemical phenotype in each and every line and as such the insertions were not silenced.

8) Finally, the manuscript relies on quantitative analyses of variation. One should not dismiss the possibility that means were not correctly estimated, especially with over dispersed measurement of fitness. There is no such thing as a perfect statistical analysis.

We hope that the manuscript is now sufficiently caveated throughout to represent the potential pitfalls. We would like to note that, as in the previous round of reviews, we went through the data to limit any effects of over-dispersion on the mean estimation.

[Editors’ note: the author responses to the previous round of peer review follow.]

1) The referees feel that the “fluctuating selection/bet hedging” theme should only (at best) appear in the Discussion and certainly not be a central theme. Much of this has to do with the analysis presented in Figure 2 using the 144 Arabidopsis accessions for drawing conclusions on natural selection as a determinant of their genotype distribution.

We have extensively rewritten the entire manuscript to focus on validating fitness effects using specific genetic manipulations and field trials. As per the suggestion by the reviewers, we have introduced a new section where we utilize the Turelli and Barton, 2004 equations to show that our empirical values for fitness on the GSL genotypes in our population fit within the range of parameters established as necessary for fluctuating selection to stabilize genetic variation as established in this paper (Turelli, 2004). We feel that this combination of rewriting in addition to testing the Turelli and Barton parameters allows the paper to show that these genes/loci affect fitness in the field and may be under fluctuating selection. We have moved Figure 2 to Figure 9 as it is now used to indicate that the natural accessions may support the Turelli and Barton model.

2) The ANCOVA analysis using the 144 accessions and its interpretation is troublesome and maybe should be removed altogether.

As suggested by the reviewer, we have removed the ANCOVA analysis.

Several of the remaining statistical analyses (Figures 4-8) should be adjusted.

We have redone all the statistics using the split-plot model structure and previous Figure 8 was removed as it is no longer appropriate. We hope that this rectifies this concern.

Survival and zeroes:

We have run all the statistics using absolute fitness with and without survivorship to show that the models have the same result and that the non-survivors are not driving the observed link between genotype and fitness. We hope that this rectifies any concerns.

Multiple testing:

We have removed previous Figure 8 as it no longer fits, and all other multiple comparisons were done using either Dunnett’s or Tukey’s post-hoc tests, which adjust for multiple testing. We hope that this rectifies this concern.

3) The validity of the fitness-proxy parameters should be better justified.

We have updated this section extensively to justify the inclusion of silique length in our measure of absolute fitness. The basis of the argument is that seed size is similar amongst all GSL genotypes and silique count approximates the potential for seed production, so including silique length, which in Arabidopsis scales linearly with seed number, gives a better approximation of seed production in these individuals. The linearity argument arises from the fact that Arabidopsis siliques contain only two lines of seeds per silique and as such a silique is really a two-dimensional fruit with regards to seed number. Therefore, we calculated absolute fitness as total fruit count (TFC) x silique length with and without survivorship (Table 5 and Figure 7 and Figure 8A). In addition, we conducted all the statistics using TFC alone with and without survivorship and got similar results (Table 3–source data 1 and Figure 7–figure supplement 1).
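
The fitness proxy described in this response can be written out as a short sketch. The field names, the millimetre unit and all numbers are hypothetical. The treatment of “with survivorship” as counting non-surviving plants as zeros is an assumption made for illustration, not a detail confirmed in this response.

```python
def absolute_fitness(total_fruit_count, silique_length_mm):
    """Absolute fitness proxy: total fruit count (TFC) x silique length."""
    return total_fruit_count * silique_length_mm


def mean_absolute_fitness(plants, with_survivorship=True):
    """Mean fitness proxy across the plants of one genotype.

    with_survivorship=True keeps non-survivors in the mean as zeros
    (an assumed interpretation); False averages over survivors only.
    """
    values = []
    for plant in plants:
        if plant["survived"]:
            values.append(absolute_fitness(plant["tfc"], plant["silique_mm"]))
        elif with_survivorship:
            values.append(0.0)
    return sum(values) / len(values) if values else 0.0


# Hypothetical plants from a single genotype
plants = [
    {"tfc": 40, "silique_mm": 12.0, "survived": True},
    {"tfc": 25, "silique_mm": 10.0, "survived": True},
    {"tfc": 0, "silique_mm": 0.0, "survived": False},
]

print(mean_absolute_fitness(plants, with_survivorship=True))
print(mean_absolute_fitness(plants, with_survivorship=False))
```

Running the same analysis on both versions, and on TFC alone, is one way to check that zeros from non-survivors are not driving the genotype effect.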

Reviewer #1:

General assessment:

The field experiment is really very interesting and has delivered a valuable data set. I disagree with several aspects of the analysis pipeline and hence I do not see sufficient support for the main conclusions. The article is not easy to read since phrasing is often vague and imprecise.

My main criticisms:

1) The projection of the experimental data onto the genotype-distribution of 144 Arabidopsis accessions does not work for me:

A) I fail to see why you would expect to see evidence for natural selection within a group of plants (accessions) of which you do not know what the original selection criteria were: for sure these are not ecotypes and it is unclear to which extent they represent the genotypes of their respective original populations (Weigel, 2012, Plant Physiol: 158:2-22). This should at least be discussed in much more detail.

We have worked to address this concern in several ways. First, we have included a new figure showing the geographic origin of the accessions (Figure 2), which shows that we are using a globally distributed collection. Secondly, we have moved this analysis of the 144 accessions and the corresponding figure to the end of the Results (Figure 9), as it was only meant to be supportive of our conclusions and it became obvious from all the reviewers’ comments that having it near the beginning was giving it dramatically more weight than expected. We hope that placing this analysis at the end of the manuscript helps to clarify its meaning. We would also like to note that we did not use the term ecotype but instead accession for the very reason mentioned by the reviewer.

B) The analysis presented in Figure 2 that justifies the field experiment (in the subsection headed “Structured population mimicking natural GSL variation in Arabidopsis”) is not strong since a test for the goodness of fit of the overall observed versus expected relative frequency distributions is missing (e.g. chi square test for goodness of fit).

We have now given the goodness of fit for the overall model as requested, which was highly significant (p value <<< 0.001).
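A whole-model goodness-of-fit test of this kind can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it assumes that the expected count of each multilocus genotype is the product of the per-locus allele frequencies times the number of accessions (random assortment), and the locus names, allele names, frequencies and observed counts are hypothetical.

```python
from itertools import product


def expected_counts(allele_freqs, n_accessions):
    """Expected multilocus genotype counts if loci assort independently."""
    loci = sorted(allele_freqs)
    expected = {}
    for combo in product(*(allele_freqs[locus].items() for locus in loci)):
        genotype = tuple(allele for allele, _ in combo)
        probability = 1.0
        for _, freq in combo:
            probability *= freq
        expected[genotype] = probability * n_accessions
    return expected


def chi_square_statistic(observed, expected):
    """Pearson chi-square statistic summed over all expected genotypes."""
    return sum(
        (observed.get(genotype, 0) - exp) ** 2 / exp
        for genotype, exp in expected.items()
    )


# Hypothetical two-locus example; not data from the study.
allele_freqs = {
    "locus_A": {"a1": 0.5, "a2": 0.5},
    "locus_B": {"b1": 0.5, "b2": 0.5},
}
observed = {
    ("a1", "b1"): 40,
    ("a1", "b2"): 10,
    ("a2", "b1"): 10,
    ("a2", "b2"): 40,
}

expected = expected_counts(allele_freqs, n_accessions=100)
statistic = chi_square_statistic(observed, expected)
print(statistic)
```

The p value would then be read from a chi-square distribution with the appropriate degrees of freedom, which depend on how many frequencies were estimated from the same data.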

C) How does the analysis of the Figure 2 data exclude non-selective processes as an alternative for random assortment (I think neither of these are mutually exclusive).

As stated above, we have moved this figure to the end of the manuscript and made it clearer that this is not proof of selective processes but does support our conclusions. We also discuss that more field trials over successive years as well as more extensive accession sampling would be necessary to fully validate a fluctuating selection model.

D) You write “field studies confirm lab results” (at the end of the subsection “GSL genetic variation controls GSL profile in the field”), but you do not provide a direct comparative analysis, just a 'visual' interpretation of the data.

We have now included a direct comparative analysis using PCA and correlation to show that the GSL profiles of the lab and field grown plants are highly comparable.
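The correlation half of such a comparison can be sketched in a few lines of Python. This is only an illustration: the GSL amounts are hypothetical, the helper names are not from the manuscript, and the PCA step is omitted. Each profile is first converted to relative abundances, since profile is described in this response as the mixture of GSLs and their relative abundance.

```python
import math


def relative_profile(amounts):
    """Convert absolute GSL amounts to relative abundances that sum to 1."""
    total = sum(amounts)
    return [a / total for a in amounts]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical amounts of three GSLs for one genotype; not data from the study.
lab_profile = relative_profile([10.0, 30.0, 60.0])
field_profile = relative_profile([5.0, 16.0, 29.0])

r = pearson_r(lab_profile, field_profile)
print(round(r, 3))
```

A coefficient close to 1 for each genotype would indicate that the lab and field profiles are highly comparable.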

E) In the subsection “Empirical fluctuating measures of selection in the field predict standing variation in GSL genes”, you conclude that “selection likely played a role”, but you do not explain where in the figures we can see this. Looking at the three plots (Figure 9), I only see that the correlation between frequency and your relative fitness-proxy is weak (probably due to UWY2012).

We agree that Figure 9 and the corresponding ANCOVA do not strongly support our conclusions, and both have been removed from the manuscript. They were replaced with calculations of the Turelli and Barton, 2004 fluctuating selection model parameters. We hope that this has addressed the concerns.

2) You used a “deterministic” approach to genotype the 144 accessions i.e. on the basis of their “GSL profiles” but a validation of this approach is missing:

A) Throughout the manuscript it remains unclear what is meant with “profile” and how these were evaluated (e.g. which grouping criteria/procedures/assignment to a genotype). Hence its validity cannot be assessed.

We have worked to better explain that profile is the mixture of GSLs within a genotype and their relative abundance. To support this description, we have included new figure supplements to Figure 1 that provide representative HPLC trace outputs for each GSL genotype which is a direct visual of what the GSL profile looks like to better illustrate this concept within this system. The way profile is defined has been expanded in the text. This was clearly an oversight and we apologize for the confusion. We have explicitly provided the rules for calling these profiles and their associated haplotypes in Figure 9–source data 1.

B) In the subsection “GSL genetic variation controls GSL profile in the field”, it is unclear to which extent actual profile information was used for the downstream analyses or when only the information on the total (aliphatic) glucosinolates was used (e.g. see Figures 4 and 5). Figure 4 appears to represent these “profiles” but a statistical evaluation is not provided.

We have included statistical analysis on the profile figures as requested and provided visuals on the profiles in Figure 1. Additionally we have worked to make it explicit when we are talking about profile vs the total of aliphatic glucosinolates.

C) Why was this indirect approach for genotyping preferred over a direct (DNA-based) approach?

As is true for other naturally variable loci in Arabidopsis, such as flowering time, there are independently evolved alleles in the GSL pathway that generate the same phenotype but appear as independent genotypes. Thus, we prefer to define the accessions by their functionality at each locus which better reflects the GSL profile differences observed. Additionally, the GSL loci have local rearrangements that make them nearly impossible to obtain accurate genotypic information using Illumina resequencing approaches and as such we have found that the data in the 1001 databases for these loci is inaccurate when we compare to direct BAC sequencing of the same accession. Thus, the available data is unable to provide accurate genotyping information which is another reason to use the indirect approach at this time.

3) The factors of the statistical analyses are often unclear:

A) The factor “location” is misleading: it should be “environment” since the different locations were used at different moments in time. Why do you refer to these differences as 'fluctuation' i.e. how do you know they are not caused solely by differences in starting conditions? Now you infer these differences from Figure 5 but the patterns across the three panels should be statistically evaluated to decide to which extent they differ.

We have changed the term “location” to “environment” as requested. We worked to minimize the differences in starting conditions by making the planting times as close as possible. Similarly, all plants were started in a greenhouse to minimize variation in starting conditions on the seedlings. As with any field trial, it is very difficult to ascribe specific causes of fitness differences, and we have worked to show that flowering time, which should be equally susceptible to starting conditions, was not linked to the resulting fitness. Even if the planting date were identical, the starting conditions across years could be strikingly different and we interpret that as a fluctuation. We have included statistical analysis within Figure 5 and Table 3 to show the differences.

B) The statistical “interaction” is misinterpreted and you dance around its meaning. Significant interactions indicate that their simultaneous effects are not additive i.e. either the combined effect is greater (synergistic) or smaller (antagonistic) than the expected (additive) effect. Pinpointing what they mean is sometimes virtually impossible and requires post hoc statistics.

We have included a new set of figures on both herbivory and fitness to show that neither the synergistic nor the antagonistic model fits the observed genotype × environment interactions, as there are numerous instances of the GSL genotypes crossing, i.e. a genotype’s rank for fitness changing from environment to environment. We hope that this visual analysis and the accompanying changes to the statistical model help to show that this fits neither the synergistic nor the antagonistic model but instead is a better fit to a fluctuating model.
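The idea of genotypes "crossing" can be illustrated with a short Python sketch that ranks genotypes by mean fitness within each environment and reports those whose rank differs between environments. The environment labels follow the trial names used in this response (UWY2011, UWY2012, UCD2012); the genotype names G1 to G3, the fitness values and the function names are hypothetical, and ties are not handled.

```python
def rank_genotypes(fitness_by_genotype):
    """Rank genotypes within one environment; rank 1 is the fittest."""
    ordered = sorted(
        fitness_by_genotype, key=fitness_by_genotype.get, reverse=True
    )
    return {genotype: i + 1 for i, genotype in enumerate(ordered)}


def genotypes_changing_rank(fitness_by_env):
    """Return genotypes whose fitness rank is not the same in every environment."""
    ranks = {env: rank_genotypes(f) for env, f in fitness_by_env.items()}
    genotypes = next(iter(fitness_by_env.values()))
    return sorted(
        g for g in genotypes if len({r[g] for r in ranks.values()}) > 1
    )


# Hypothetical mean fitness values; not data from the study.
fitness_by_env = {
    "UWY2011": {"G1": 3.0, "G2": 2.0, "G3": 1.0},
    "UWY2012": {"G1": 1.5, "G2": 2.5, "G3": 1.0},
    "UCD2012": {"G1": 2.0, "G2": 1.0, "G3": 0.5},
}

crossing = genotypes_changing_rank(fitness_by_env)
print(crossing)
```

Under a purely synergistic or antagonistic interaction the effect sizes would change between environments but the ordering of genotypes would stay the same, so a non-empty result is the pattern that points toward a fluctuating model.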

C) The ANCOVA procedure is not explained anywhere.

The ANCOVA and the corresponding figure have been removed as requested by other reviewers.

4) Your fitness proxies, and especially how they were normalized, need validation. In “GSL variation impacts fitness in the field”, you describe a normalization procedure (which assumes a linear relationship between silique length and fecundity) which struck me as highly arbitrary. This procedure needs references or a solid validation.

We have built our fecundity procedures on the cited references from Schmitt and colleagues that are inherently built on a linear assumption of silique number and fecundity. At its heart, this makes an implicit assumption that silique number and seed set are directly related. We however found that silique length was genetically determined and as such this implicit assumption may be incorrect. In Arabidopsis, the silique is a linear fruit with two files of seeds. We found that seed size was not different amongst genotypes and thus silique length is directly correlated to the number of seeds that can be within that silique. Thus as we had genetic variation in silique length we had to adjust the approximation of Schmitt and colleagues to appropriately reflect this fact. We have worked to make this clearer within the text. In addition, we conducted all the statistics using total fruit count alone with and without survivorship and got similar results as we did for absolute fitness (Table 3–source data 1 and Figure 7–figure supplement 1).

Reviewer #3:

Statistics:

1) This is a split plot design (for insecticide treatment), not a randomized complete blocks design. Correction of this analysis will certainly affect inferences related to the insecticide treatment, and may alter other parts of the model.

We have redone the entire analysis as a split-plot analysis. All statistics throughout the manuscript have been adjusted to reflect this new analysis. As suggested by the reviewer, this led to a loss of significance of the treatment effect, which required us to rework the entire manuscript to remove any inference about the effects of herbivory, as there was no difference between the pesticide and control plots. The observation that GSL gene variation and an interaction of GSL gene variation and environment are linked to fitness was still significant.

2) Inclusion of individuals with zero fitness is not compatible with the distributional assumptions of this ANOVA. This should be visible in the residuals, although no statement is made regarding such verification of statistical assumptions. Such zero-inflated data pose a difficult problem in such analyses, which typically alter levels of statistical significance and often cause spurious rejection of the null hypothesis.

We have empirically tested if the zero fitness individuals were causing spurious significance by running the model using fitness as calculated with the zeros and without the zero individuals. This is now shown in the manuscript in Table 5 where we show that both models lead to the same interpretation of the data. The residuals are also not significantly affected by the zeros because they are relatively infrequent in comparison to the non-zeros. Throughout the manuscript we have presented analysis using fitness with and without survivorship to allow the reader to specifically compare the two results so that they can confirm our conclusions are not driven by survivorship.

3) Controls for multiple statistical tests are needed at several points in this manuscript. Examples include tests for non-random distribution of multilocus GSL genotypes (Table 2), comparison of treatment effects, and variation of GSL among sites (Figure 5).

As per the request from Reviewer 1, we now include the whole-model Chi-square goodness of fit for Figure 9 (old Figure 2) and show that the whole model is highly non-significant. The remaining p values in this table are simply to provide indications of which genotypes are more or less deviating within that model. For the other figures, we apologize that we had not noted that we had conducted various post-hoc tests, such as Tukey’s or Dunnett’s, to account for multiple comparisons. This has now been clarified in all instances. Because there was no significant treatment effect, we have removed the comparison from the manuscript.

Evolution and Genetics:

4) The evolutionary significance of this study is justified as a test whether fluctuating selection maintains genetic variation within and among populations. However, these experiments do not estimate herbivory or fitness for the individual GSL loci, and they do not show change in rank fitness at putatively selected loci, which is a necessary condition for balancing selection to maintain non-neutral genetic polymorphism.

We have included a new figure showing that the ranks of the GSL genotypes are changing across the environments (Figure 7). Additionally, we have added a new section where we utilize the Turelli and Barton model to estimate the per locus vi and the population K to show that the empirical values we have found are sufficient to fit the necessary conditions as modeled by Turelli and Barton to allow for fluctuating selection to maintain multi-locus genetic polymorphisms. Additionally, we have reworked the entire manuscript to focus more on the fact that the GSL genes affect field fitness. We hope that this helps to address these concerns as well as makes for a better manuscript as a whole.

5) Even if the patterns in Figure 8 were due to selection, they might be attributable to non-equilibrium directional selection in Eurasia, rather than to historical balancing (fluctuating) selection. Consequently, this analysis cannot prove that such patterns are due to fluctuating pressures maintaining standing natural variation within a species.

We agree with the reviewer that proof of how historical selection has occurred is nearly impossible to obtain. Throughout the manuscript we had attempted to state that we have generated data that supports a fluctuating hypothesis but that more extensive field trials and more extensive inter and intra population sampling within the species range are required to further validate this hypothesis. We have worked through the manuscript to ensure that we do not state we have generated proof but instead simply support.

6) The observation that multi-locus genetic variation controlling aliphatic GSL appears to be non-randomly distributed among the natural accessions is interpreted as evidence for natural selection. However, this may also result from population structure, nonrandom geographic sampling, finite population size, or failure to correct for multiple tests.

We apologize for the obvious confusion on this section and figure, in which we had meant to state that the pattern may result from natural selection, population structure, sampling or any other demographic process, and then to state that separating between these options needed field tests of specific genetic variants. Instead of having this analysis at the front, we have moved this to the last figure (Figure 9) to indicate that the natural variation in GSL chemotype observed among Arabidopsis thaliana accessions agrees with the field trial estimates of the Turelli and Barton parameters that support the hypothesis that fluctuating selection can balance multi-locus polygenic genetic variation within the aliphatic GSL pathway.

7) “We observed significant variation in silique lengths across genotypes.” How do GSL polymorphisms alter silique length? And, how do these effects differ among pesticide treatments? Alternatively, rather than the effects of GSL polymorphisms, variation in silique length (and fitness) may be due to position effects of the transgene inserts, or untagged Agrobacterium hits, or linked mutations not eliminated following EMS.

We have worked to make it clearer in the text that the genetic backgrounds of our GSL genotypes are unlikely to have second site insertions because they have been back-crossed multiple times to eliminate unlinked effects. Additionally, the majority of lines are specific insertions within the GSL gene to create the natural knockout allele. Further, in reply to Reviewer 2, we have noted that our previous work has shown that unexpected links to non-GSL phenotypes, such as flowering time, have been validated with both the transgenic and natural populations, i.e. we could map circadian timing variation to GSL loci in an Arabidopsis RIL population and then show the effect using the transgenic validation lines. This was with both enzymatic loci and transcription factor loci, showing that this is not a second site effect but instead a link from GSL to the clock and flowering. We wish we could provide more mechanistic insight into these non-traditional roles of GSLs but these experiments are still underway. We have included a new section showing that these pleiotropic effects on flowering and indole glucosinolate are not correlated with fitness, showing that the GSL to fitness link is not an artifact of some secondary indirect random pleiotropy as best we can measure.

8) Several points weaken the proposed role of herbivory in shaping the observed patterns in GSL polymorphisms:

We have largely removed the herbivory arguments because the split plot analysis removed any significance associated with treatment meaning that we don’t have the capacity to support this argument.

8A) In Figure 6 we see that the ranking of herbivore damage (without insecticide) decreases from UWY2012 > UCD2012 > UWY2011. However, the correlations in Figure 12 show the opposite pattern: UWY2012 < UCD2012 < UWY2011. This seems incompatible with the conclusion that the herbivory differences in the field reflect natural selection by herbivores in Eurasian Arabidopsis over thousands of generations.

We have removed the ANCOVA as requested as well as the corresponding figure.

8B) Similarly, the authors note that “the positive correlation between observed genotype frequency and fitness disappeared in the high herbivory WY2012 field trial.” This suggests that these results may not be due to herbivory.

We have removed the ANCOVA as requested as well as the corresponding figure.

8C) To test for herbivory-mediated effects, one could ask whether the ANCOVA is significant if only the no-pesticide treatment is analyzed. Or, what happens to the patterns in Figure 8 if the change in fitness between insecticide treatments is used as the response variable for the ANCOVA?

We have removed the ANCOVA as requested by the reviewers.

https://doi.org/10.7554/eLife.05604.045

Article and author information

Author details

  1. Rachel Kerwin

    Department of Plant Sciences, University of California, Davis, Davis, United States
    Contribution
    RK, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article, Contributed unpublished essential data or reagents
    Competing interests
    No competing interests declared.
  2. Julie Feusier

    1. Department of Plant Sciences, University of California, Davis, Davis, United States
    2. Department of Genetics, University of Utah, Salt Lake City, United States
    Contribution
    JF, Acquisition of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  3. Jason Corwin

    Department of Plant Sciences, University of California, Davis, Davis, United States
    Contribution
    JC, Analysis and interpretation of data, Drafting or revising the article, Contributed unpublished essential data or reagents
    Competing interests
    No competing interests declared.
  4. Matthew Rubin

    Department of Botany, University of Wyoming, Laramie, United States
    Contribution
    MR, Conception and design, Acquisition of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  5. Catherine Lin

    Department of Plant Sciences, University of California, Davis, Davis, United States
    Contribution
    CL, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  6. Alise Muok

    1. Department of Plant Sciences, University of California, Davis, Davis, United States
    2. Department of Biochemistry, Cornell University, Ithaca, United States
    Contribution
    AM, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  7. Brandon Larson

    1. Department of Plant Sciences, University of California, Davis, Davis, United States
    2. US Department of Agriculture Plant Soil and Nutrition Research Unit, Cornell University, Ithaca, United States
    3. Boyce Thompson Institute for Plant Research Sciences, Faculty of Science, Cornell University, Ithaca, United States
    Contribution
    BL, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  8. Baohua Li

    Department of Plant Sciences, University of California, Davis, Davis, United States
    Contribution
    BL, Acquisition of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  9. Bindu Joseph

    Department of Plant Sciences, University of California, Davis, Davis, United States
    Contribution
    BJ, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  10. Marta Francisco

    1. Department of Plant Sciences, University of California, Davis, Davis, United States
    2. Misión Biológica de Galicia, Pontevedra, Spain
    Contribution
    MF, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  11. Daniel Copeland

    Department of Plant Sciences, University of California, Davis, Davis, United States
    Contribution
    DC, Acquisition of data, Drafting or revising the article
    Competing interests
    No competing interests declared.
  12. Cynthia Weinig

    Department of Genetics, University of Utah, Salt Lake City, United States
    Contribution
    CW, Analysis and interpretation of data, Drafting or revising the article, Contributed unpublished essential data or reagents
    Competing interests
    No competing interests declared.
  13. Daniel J Kliebenstein

    1. Department of Plant Sciences, University of California, Davis, Davis, United States
    2. DynaMo Centre of Excellence, Department of Plant and Environmental Sciences, Faculty of Science, University of Copenhagen, Copenhagen, Denmark
    Contribution
    DJK, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    kliebenstein@ucdavis.edu
    Competing interests
    DJK, Reviewing editor, eLife.
    ORCID: 0000-0001-5759-3175

Funding

National Science Foundation (NSF) (DGE 0653984)

  • Rachel Kerwin

National Science Foundation (NSF) (DBI 0820580)

  • Julie Feusier
  • Jason Corwin
  • Catherine Lin
  • Alise Muok
  • Brandon Larson
  • Baohua Li
  • Bindu Joseph
  • Daniel Copeland
  • Daniel J Kliebenstein

National Science Foundation (NSF) (MCB 1330337)

  • Julie Feusier
  • Jason Corwin
  • Catherine Lin
  • Alise Muok
  • Brandon Larson
  • Baohua Li
  • Bindu Joseph
  • Daniel Copeland
  • Daniel J Kliebenstein

Danish National Research Foundation (DNRF99)

  • Daniel J Kliebenstein

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Carlos Quiros, Ute Wittstock and Bjarne G Hansen for their generous donations of seed stocks and members of the Kliebenstein and Weinig labs for assistance in the field.

Reviewing Editor

  1. Merijn R Kant, Reviewing Editor, University of Amsterdam, Netherlands

Publication history

  1. Received: November 13, 2014
  2. Accepted: March 18, 2015
  3. Version of Record published: April 13, 2015 (version 1)

Copyright

© 2015, Kerwin et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,831
    Page views
  • 406
    Downloads
  • 17
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
