1. Genomics and Evolutionary Biology
Download icon

Spontaneous mutations and the origin and maintenance of quantitative genetic variation

  1. Wen Huang
  2. Richard F Lyman
  3. Rachel A Lyman
  4. Mary Anna Carbone
  5. Susan T Harbison
  6. Michael M Magwire
  7. Trudy FC Mackay Is a corresponding author
  1. North Carolina State University, United States
  2. Knox College, United States
Research Article
Cited
1
Views
1,839
Comments
0
Cite as: eLife 2016;5:e14625 doi: 10.7554/eLife.14625

Abstract

Mutation and natural selection shape the genetic variation in natural populations. Here, we directly estimated the spontaneous mutation rate by sequencing new Drosophila mutation accumulation lines maintained with minimal natural selection. We inferred strong stabilizing natural selection on quantitative traits because genetic variation among wild-derived inbred lines was much lower than predicted from a neutral model and the mutational effects were much larger than allelic effects of standing polymorphisms. Stabilizing selection could act directly on the traits, or indirectly from pleiotropic effects on fitness. However, our data are not consistent with simple models of mutation-stabilizing selection balance; therefore, further empirical work is needed to assess the balance of evolutionary forces responsible for quantitative genetic variation.

https://doi.org/10.7554/eLife.14625.001

eLife digest

A key challenge in evolutionary biology is to understand how genetic variation – differences in the DNA of individuals in a population – is generated and maintained to create the enormous diversity that exists in nature. Mutations to the DNA introduce new variation, but these are constantly removed from populations by two other evolutionary forces: natural selection and genetic drift.

Natural selection removes harmful genetic mutations that affect an organism’s fitness and reproduction, and genetic drift is the random increase in, or loss of, a genetic variant from a population over time. However, disentangling the effects of these evolutionary forces is challenging because the genetic variation we observe is often the final product of a long history of interaction between them.

Huang et al. have now investigated genetic variation by breeding fruit flies in the laboratory. Natural selection was minimized for these flies; genetic drift was therefore the main force that removed variation.

Huang et al. then sequenced the DNA of the flies to estimate the rate at which genetic mutations spontaneously occur. The sequences contained many more “high-impact” mutations (which directly affect how proteins in the fly’s cells work) than seen in sequences taken from a natural fly population.

Traits that are produced by the cumulative actions of many genes and the environment are known as quantitative traits. By examining how much variation genetic mutations introduced into the quantitative traits of each generation of the laboratory-grown flies, Huang et al. estimated how much variation should occur in a natural population whose quantitative traits evolved without natural selection. This estimate was much higher than the levels of genetic variation seen in nature, suggesting that natural selection acts to eliminate mutations that significantly affect quantitative traits.

Simple theoretical models cannot explain the relatively high spontaneous mutation rate and low genetic variation seen in the quantitative traits of natural populations. Therefore, further work is now required to understand more about the balance of evolutionary forces that maintain quantitative genetic variation.

https://doi.org/10.7554/eLife.14625.002

Introduction

Mutation is the ultimate source of genetic variation. In natural populations of finite size, however, mutations and the genetic variation they introduce are constantly removed by genetic drift and by purifying and stabilizing natural selection. Therefore, to understand how genetic variation is generated and maintained, one must know the rate by which spontaneous mutations occur and their effects on fitness and quantitative traits, and the forms and consequences of natural selection on genetic variation. To answer these fundamental questions, we take advantage of mutation accumulation (MA) lines of Drosophila melanogaster derived from a single inbred genome and independently maintained under conditions of minimal natural selection, as well as a panel of wild-derived inbred lines representing a balanced state of mutations, drift and natural selection. The expectation is that MA lines are experiencing severe genetic bottlenecks thus have minimal natural selection. Therefore, unless a mutation is highly deleterious or lethal, its fate is determined primarily by genetic drift. This property of MA lines is distinctly different from natural populations and allows us to derive the expectation of genetic variation under neutrality using the estimated mutational variance among the MA lines. By comparing the expectation under neutrality to the observed level of genetic variation in a natural population, one can infer the form and consequences of natural selection on genetic variation (Figure 1).

Experimental design of the DGRP and mutation accumulation (MA) lines.

(a) The DGRP is a collection of inbred strains derived from the wild population in Raleigh, NC. The spectrum of mutations and genetic variation in the DGRP is a reflection of the combined effects of mutations, drift, and selection. In this study, we used the DGRP genotype data (Huang et al., 2014) to estimate the parameter i=1GVar(gi) (see Materials and methods), the microarray expression data collected by (Ayroles et al., 2009) and organismal phenotypes (sleep traits from Harbison et al., 2009) to estimate genetic variance among the DGRP lines (Vg). (b) We derived 25 MA lines from the inbred line DGRP_360. These lines were kept with small population sizes of 10 females and 10 males such that there is minimal natural selection. At generation 60, we sequenced the MA lines to estimate mutation rate (μ) and i=1MVar(mi), and obtained microarray gene expression and organismal phenotypic data to estimate mutational variance (Vm).

https://doi.org/10.7554/eLife.14625.003

We use whole genome sequencing to determine the rate and characteristics of spontaneous mutations, and quantitative measurements of gene expression and organismal traits in the MA and wild-derived lines to understand the origin and maintenance of quantitative genetic variation. Previous MA studies have successfully used similar strategies to identify mutations and estimate mutation rates in yeast (Lynch et al., 2008), algae (Ness et al., 2012), nematodes (Denver et al., 2004), flies (Haag-Liautard et al., 2007; Keightley et al., 2009; Schrider et al., 2013), and plants (Ossowski et al., 2010); and to characterize the nature of natural selection on gene expression (Denver et al., 2005; Rifkin et al., 2005). However, compared to the present study, they were small in scale, did not simultaneously identify mutations at the DNA level and estimate mutational variance for gene expression and organismal quantitative traits, or did not compare mutational variation to standing genetic variation in the same equilibrium population from which the MA lines were derived.

Results

Identifying spontaneous mutations

To accumulate spontaneous mutations, we split one sequenced inbred line from the Drosophila melanogaster Genetic Reference Panel (DGRP) (Mackay et al., 2012) into 25 MA lines, and maintained them in small populations (10 females and 10 males) for many generations (Figure 1b). The small population size will minimize natural selection and allow non-lethal mutations to drift to high frequency or fixation and accumulate over time. We sequenced all 25 MA lines at generation 60 and obtained deep sequence data for 23 lines (Supplementary file 1A). We detected a total of 1,456 mutations that were either fixed or segregating at high frequency (>0.20) on the euchromatic nuclear genome in the MA lines. The number of mutations per MA line ranged from 35 to 193, with a mean of 63 (Supplementary file 1B). To validate mutations detected by Illumina’s sequence-by-synthesis chemistry, we sequenced a randomly selected set of 51 mutations in five MA lines at generation 130 by capillary sequencing (Sanger). We reasoned that the 70 generations between mutation detection and validation allowed sufficient time for mutations that were segregating at generation 60 to drift to fixation, and hence be more reliably detected by Sanger sequencing. The observed fixation probabilities of putative mutations agreed well with the expected probabilities given their initial frequencies at G60 (Figure 2), suggesting that both the mutations and their frequency estimates were accurate.

Validation of mutations.

51 Mutations detected at G60 (frequency > 0.2) in MA lines 2, 11, 13, 17, or 24 were randomly selected and sequenced by Sanger sequencing at G130 and classified as Fixed (cross) or Lost (circle) by manually inspecting chromatograms. The fixation status is plotted against the initial mutant frequency at G60. A LOESS smooth line was fitted to the data (point estimates = solid line, 95% confidence interval = broken line) to estimate the fixation probability. Expectation of fixation probability (= initial mutant allele frequency) is indicated by the grey diagonal line.

https://doi.org/10.7554/eLife.14625.004

Genomic properties of spontaneous mutations

Among the 1456 mutations, there were 1203 single base substitutions (SBS), 17 multiple base substitutions (MBS), 141 deletions, 47 insertions, and 48 complex mutational events that involved a combination of base substitutions and indels (Supplementary file 1B). The numbers of different types of mutations are proportional to their mutation rates; thus the indel mutation rate was approximately 1/6 that of SBS, and deletions occurred at a frequency three times higher than insertions. Of the 1203 SBS, 595 were transitions (Ti) and 608 were transversions (Tv), corresponding to a Ti/Tv ratio of 1.96 (Figure 3). There were twice as many base substitution mutations at G or C sites (n = 828) than at A or T sites (n = 401), which, given the 42.48% genomic GC content, indicated a strong bias (~ 2.80 fold increase) towards mutations at G or C bases. These numbers agreed well with a previous study using the same sequencing strategy of three D. melanogaster MA lines, which reported a Ti/Tv ratio of 1.95 and an approximately two-fold increase in mutations at G/C bases (Keightley et al., 2009). We then asked if genomic context affected whether or not a mutation occurred. While there was no appreciable difference in local sequence (20 bp up and downstream) GC content between mutations and randomly sampled sites, indels appeared to occur more often in low complexity regions, where homopolymers and short tandem repeats can often be found and are prone to replication slippage (Figure 4).

Classification of single base substitutions (SBS).

Single base substitutions were classified according to their ancestral alleles (top and bottom box) and mutant alleles (middle boxes). The size of each of the middle boxes indicates number of mutations in each class. Blue and red lines indicate transitions and transversions, respectively.

https://doi.org/10.7554/eLife.14625.005
Genomic context of mutations.

Sequence composition (GC content, a) and complexity (b) for local sequences (20 bp up and downstream of mutations) are plotted as box plots and compared between different types of mutations. The 'Random genomic' class contains 1000 randomly chosen sites in the genome. Sequence complexity is measured as –lnS, where S is calculated using the algorithm in NCBI’s DUST program and measures sequence complexity. P values were computed by Wilcoxon’s rank sum tests comparing data in each category to the 'Random genomic' category.

https://doi.org/10.7554/eLife.14625.006

To characterize the functional effects of spontaneous mutations, we annotated their genomic locations (exonic, intronic or intergenic) and compared the distribution to that of standing variation in molecular polymorphisms in the DGRP, and to the fraction of total genomic sites in each of these categories. Standing variation in the DGRP reflects the demographic history of the natural Raleigh, NC population and is depleted of exonic variants (Figure 5a). However, the extent of depletion of exonic spontaneous mutations was much weaker – the spectrum of spontaneous mutations detected in the MA lines more closely reflected the proportion of exonic, intronic and intergenic bases in the genome (Figure 5a). We further classified coding variants and mutations according to their functional impacts on polypeptide sequences. Remarkably, the proportions of frame-shift, stop gained or lost, and nonsynonymous spontaneous mutations were significantly greater than the proportion of segregating polymorphisms in the DGRP for these categories (Figure 5b). While the fitness effects of these protein sequence mutations may differ under laboratory and natural environments, these results clearly indicate that deleterious mutations that would otherwise be lost have accumulated in the MA lines. In keeping with the inference that spontaneous mutations accumulated under minimal natural selection, the numbers of mutations per gene was primarily a function of gene length (Figure 6); and there was no gene ontology (GO) category enrichment for genes harboring new mutations (Supplementary file 1C).

Mutation accumulation lines accumulate deleterious mutations.

Annotation of genomic bases, standing variation, and mutations in MA lines according to their (a) genomic locations and (b) functional impact on protein sequence.

https://doi.org/10.7554/eLife.14625.007
Relationship between number of mutations and gene length.

Number of mutations detected in MA lines for each gene is plotted against the total number of bases covered for mutations.

https://doi.org/10.7554/eLife.14625.008

Spontaneous mutation rate

To estimate the spontaneous mutation rate, we first inferred the effective population size in the MA lines based on the observed mutation frequency spectrum of presumably neutral mutations. This was achieved by a maximum likelihood procedure where the probability density was obtained by simulation. The parameter Ne was estimated by taking the value that maximizes the likelihood of observing the data given the probability density. Although the census population size of each MA line was 2Ne=40 haploid genomes, the effective population size was estimated to be 2Ne=19 for the X chromosome and 2Ne=23 for autosomes (Figure 7a,b). The frequency spectrum of the observed effective population size visually agreed with the expectation (Figure 7a,b). At this population size, for selection coefficients s ranging between -0.01 and 0.01 and assuming additivity, the fixation probability u=1e2s1e4sNe (Kimura, 1957) is between 0.034 and 0.054, which closely centers around the fixation probability for neutral sites. The large difference between census and effective population sizes could be due to the large variance in number of offspring commonly observed in flies (Crow and Morton, 1955). Although the X to autosome Ne ratio was slightly higher than the expected 0.75, this difference was not statistically significant (P = 0.40). The estimates of Ne did not differ whether or not presumably non-neutral mutations (mutations that changed amino acids) were included, further suggesting that drift was the primary force driving mutation frequency dynamics in these MA lines. Given the estimated effective population size, the marginal (integrated over 60 generations) probability of a mutation to attain the frequency cutoff (0.2) was 7.20% on the X chromosome and 6.18% on autosomes. We used these estimates of effective population size to estimate the spontaneous mutation rate on the X chromosome and autosomes, given the number of observed mutations in each line, taking into account variable sequence coverage in each line.

Inference of effective population size and mutation rate.

Expected and observed distributions of mutant allele frequency on autosomes (a) and the X chromosome (b). The expected distribution was generated based on estimates of effective population size (2Ne). Mutation rate estimates of autosomes and X chromosomes are compared for different types of mutations (c) and for different lines (d). In (d), the Pearson’s correlation coefficient (r) is calculated with (orange line) or without (green line) MA19 (orange circle).

https://doi.org/10.7554/eLife.14625.009

The median spontaneous mutation rate was 5.21 × 10–9 per base on autosomes and 5.07 × 10–9 on the X chromosome for single base substitutions and 0.79 × 10–9 on autosomes and 0.65 × 10–9 on the X chromosome for indels. The overall spontaneous mutation rate for all types of mutations combined was 6.25 × 10–9 on autosomes and 6.96 × 10–9 on the X chromosome (Figure 7c), similar to recent mutation rate estimates from MA studies using high throughput sequencing (Keightley et al., 2009; Schrider et al., 2013). There was substantial variation in spontaneous mutation rates among the MA lines (Figure 7c). The mutation rate in MA19 was nearly a magnitude greater than the lines with the smallest mutation rates (MA08 for autosomes and MA15 for the X chromosome). Such a large difference cannot be solely explained by variability in 2Ne among the MA lines, as varying 2Ne from 10 to 40 can only account for a 28% difference in mutation rate (difference in 2Ne*p, see Materials and methods). The mutation rates on autosomes and the X chromosome showed no systematic difference (Figure 7c, paired Wilcoxon rank sum test P = 0.64) and were positively correlated (Figure 7d), further suggesting the random distribution of mutations.

Rate of introduction of genetic variation by spontaneous mutations

To assess the effects of mutations on quantitative traits, specifically the rate at which mutations introduce genetic variation, we analyzed the genetic variation of genome-wide gene expression profiles, two bristle number traits, and five sleep and activity traits. We partitioned the phenotypic variation observed among the collective sample of individuals from the MA lines into the between-line genetic variation due to mutations (VMA) and the within-line variation due to environmental or technical noise (Ve). Although any pair of MA lines only differ on average by 126 sites, all organismal phenotypes and the expression of a large fraction of genes in both sexes (1526/7566 = 20.2% of expressed genes in females and 3,872/8,136 = 47.6% in males) accumulated significant between-line variance (Supplementary file 1D–E, Figure 8). Mutational variance (Vm), the amount of genetic variation introduced by mutations in each generation, scaled by environmental variance and expressed as the mutational heritability (hm2= Vm/Ve), had a median of 0.55 x 10–3 for gene expression traits in females and 0.75 x 10–3 in males (Figure 8). hm2 for organismal traits ranged between 0.30 x 10–3 and 2.88 x 10–3, of the same order of magnitude as observed in earlier studies (Houle et al., 1996). These values were near or among the upper quartile of that for gene expression traits with the exception of abdominal bristle number in males (Supplementary file 1D). The difference in mutational heritabilities between gene expression traits and organismal phenotypes may be due to a larger number of QTLs influencing organismal traits and thus a larger mutational target size.

High rate of mutational variance.

Histogram of mutational heritability (hm2=Vm/Ve) for females (a) and males (b) are plotted on the log scale. The placements of organismal traits in the hm2 bins are indicated by lines connecting the bars and the trait names. AB = abdominal bristle and SB = sternopleural bristle.

https://doi.org/10.7554/eLife.14625.010

We next assessed whether genes exhibiting mutational variation for gene expression were associated with GO categories. For each GO category, we compared the mutational variance of genes in the category to those not in the category. For significant GO categories, we further polarized the difference in rate of accumulation of mutational variance as faster or slower within the category. Although mutations occurred randomly across the genome and were not associated with GO categories, the rate of accumulation of genetic variation was significantly different for genes within many GO terms than the rest of the transcriptome (Supplementary file 1F). For example, in both sexes, mutational variance accumulated faster for genes involved in chitin metabolism, iron binding, and sensory perception of chemical stimulus, but slower for genes involved in protein translation, mRNA splicing, and mitotic spindle organization. Finally, we compared mutational variation in gene expression to plasticity of gene expression across a wide range of environments (Zhou et al., 2012). We observed that genes that are more plastic to macro-environmental perturbations accumulated genetic variation at a faster rate (Figure 9), suggesting a shared control of gene expression variation by mutations and environmental perturbations (Landry et al., 2007).

Correlation between mutational variance and environmental plasticity.

Mutational variance (Vm) is plotted against variance due to environmental plasticity (VENV) for females (a) and males (b). The dashed line is a LOESS fit to the data. Spearman’s correlation (ρ) and the P value of a test for its significance are also indicated.

https://doi.org/10.7554/eLife.14625.011

Strong stabilizing selection on quantitative trait variation

Under the neutral model of polygenic phenotypic evolution, the among-line variance (Vg) in the DGRP inbred lines is 4NVm, where N is the effective size of the wild population from which DGRP was derived (Lynch and Hill, 1986). We estimated the genome-wide nucleotide diversity in the DGRP (π) to be π = 4.92 x 10–3 in the DGRP (Huang et al., 2014), from which we estimated N to be 186,363 (π4μ). For all organismal and the majority of gene expression traits, the magnitude of standing genetic variation in the DGRP was much smaller than that predicted under the assumption of neutrality (Figure 10). Therefore, there must be strong stabilizing selection that acts either directly on the traits (direct stabilizing selection) or through pleiotropic fitness effects of new mutations (apparent stabilizing selection), which constrains the accumulation of genetic variation by mutations for gene expression and organismal traits (Denver et al., 2005; Rifkin et al., 2005). To understand the consequence of the strong apparent stabilizing selection, we estimated the expectation of allelic effects based on the observed sequence and quantitative trait variation among the MA and DGRP lines. The amount of genetic variation is proportional to sequence variation by a factor of E(a2) (see Materials and methods), where a is the allelic effect of a mutation on quantitative traits. For the majority of traits, the ratio of mutational to standing genetic variance, Vm/Vg, far exceeded the expectation given the observed sequence variation (Figure 10); thus the allelic effects of spontaneous mutations, E(am2), were several orders of magnitude larger than that of standing DNA variation, E(ag2). This result suggests that the apparent stabilizing selection, directly for fitness and indirectly for traits correlated with fitness, had either eliminated mutations with large effects on quantitative traits or modified their effects. The former appears to be at least partly true given the obvious difference in functional categorization of spontaneous mutations and standing DNA variation (Figure 5).

Strong apparent stabilizing selection of quantitative trait variation.

Distributions of Vm/Vg in females (a) and males (b) are plotted on the log scale. The blue, red, and purple bars indicate genes with significant (FDR = 0.05) among-line variance in DGRP only, MA lines only, and both DGRP and MA lines respectively. Placements of organismal traits are indicated by lines connecting the bars and the trait names. AB = abdominal bristles and SB = sternopleural bristles. Neutral expectations 1/4N and Var(mi)/kVar(gi) are also indicated.

https://doi.org/10.7554/eLife.14625.012

Using Vm/Vg as an indicator of the strength of the apparent stabilizing selection (higher values = stronger selection), we examined the properties of genes associated with variation in Vm/Vg for gene expression. First, genes expressed in both sexes are under slightly stronger selection than those expressed in only one sex, and the strength of selection has a modest but highly significant positive correlation between the two sexes (Figure 11). Second, there is stronger selection for genes on the X chromosome than autosomal genes, and this effect is more pronounced in males than females (Figure 12). Third, we partitioned the genome with respect to GO categories and assessed the significance of the difference of Vm/Vg for genes associated with each GO category and those not in the category. Many genes in GO categories associated with essential cellular functions related to transcription, translation, cell cycle, and energy metabolism, among others, appeared to be under stronger selection in both sexes (Supplementary file 1G, Figure 13).

Strength of apparent stabilizing selection in females and males.

(a) Vm/Vg for gene expression in females are plotted against that in males. (b) Boxplots of Vm/Vg for sex-specific (S) and non-specific (NS) genes. Within each sex, Vm/Vg for sex-specific and non-specific genes are compared using Wilcoxon’s rank sum test.

https://doi.org/10.7554/eLife.14625.013
Strength of apparent stabilizing selection on autosomes and X chromosome.

Boxplots of Vm/Vg are plotted for each chromosome in females and males. Wilcoxon’s rank sum test is used to compare Vm/Vg on autosomes (A) and on X chromosome (X).

https://doi.org/10.7554/eLife.14625.014
Strength of apparent stabilizing selection differs for genes in different functional categories.

Boxplots of Vm/Vg for genes in selected Gene Ontology (GO) categories as compared to all genes.

https://doi.org/10.7554/eLife.14625.015

Discussion

The origin and maintenance of quantitative trait variation are fundamental problems in evolutionary biology and have profound implications in agriculture and medicine, where most economically and medically relevant traits are quantitative in nature. Mutations are the ultimate source of genetic variation, but they are rare and constantly being removed by natural selection and genetic drift. The purifying property of both selection and drift on mutations makes it especially difficult to study the characteristics of spontaneous mutations in natural populations because it is difficult to attribute a mutational pattern to either one of these evolutionary forces. In this study, we combined the classical mutation accumulation design, which subjects inbred lines of Drosophila melanogaster to genetic bottlenecks for many generations, with modern high throughput technologies. This allowed us to largely separate the effects of genetic drift from the combined effects of both selection and drift, a key advantage that is not possible from observations only on natural populations.

Our estimates of mutation rates could be biased either upwards or downwards, depending on the initial fitness of the inbred line. On the one hand, lethal and highly deleterious mutations would not accumulate in these lines. On the other hand, beneficial mutations fix at a higher probability than the neutral probability assumed. Although spontaneous mutations occurred at a relatively low rate of 6.60 × 10–9 averaged over autosomes and the X chromosome, they replenished genetic variation at a significant rate for gene expression (sex averaged rate = 0.65 × 10–3Ve per generation) and organismal traits (sex averaged rate = 1.36 × 10–3Ve per generation). Our estimates of hm2 for gene expression traits were higher than observed in previous studies (Rifkin et al., 2005), possibly due to smaller technical variation and thus smaller Ve in this study. Although the MA lines were derived from the same progenitor line and were raised under homogeneous conditions, mutation rates among the MA lines varied significantly. This implies either that non-genetic factors affect mutagenesis and/or that some spontaneous mutations themselves affect mutation rate; the latter is less plausible because such mutations must occur early in many lines to have a pronounced effect. The MA lines accumulated a broad functional spectrum of spontaneous mutations, including a significant fraction of high impact mutations that would otherwise likely be removed by natural selection. The relatively small number of mutations and the large number of genes with significant mutational variance in expression implies pervasive pleiotropic effects by new mutations. Gene expression traits often form co-expression modules (Ayroles et al., 2009), and therefore mutations that directly influence expression of a small number of loci can cause secondary trans effects at a much larger number of genes. Taking this networked view of gene expression traits, any selection will not act on individual traits, but rather on the combined effects of all traits. Consistent with this notion, we found stronger selection on genes that would cause trans pleiotropic effects such as transcription factors (Supplementary file 1G).

Because of the scale of experiment and the large number of MA lines needed to accurately estimate mutational variance, we chose an experimental design that focused on a single inbred strain to derive a relatively large number of MA lines. This design, however, does not allow us to infer the effects of genetic background on the rates of mutation and introduction of genetic variation. There is limited evidence for genuine genetic variation for mutation rate. However, this is primarily due to technical limitations because mutation rate is sensitive to environmental and physiological factors that cannot be easily controlled and it is prohibitive to collect genetic data on mutation rates in natural populations. The effects of genetic background can be studied with a modified design using the DGRP by deriving MA lines from multiple genetically diverse inbred strains and controlling environments. Because of the close proximity of mutation rates in this study and earlier studies of the same species, the conclusions drawn in this study are likely to hold across genetic backgrounds, especially given the magnitude of apparent stabilizing selection, which is unlikely to be attributable to genetic factors unique to the ancestral line.

It is entirely possible that mutation rates may be different in inbred conditions than in natural populations, which has significant theoretical consequences (Agrawal, 2002). While we could not formally exclude the possibility, the mutational variance we observed in the MA lines appears to be too large (by a few orders of magnitude) to be solely explained by elevated mutation rates under stressful conditions.

The issue of why genetic variation for quantitative traits segregates at appreciable levels in natural populations despite the tendency of genetic drift and directional and stabilizing selection to erode it remains an unresolved puzzle. Our data unequivocally reject the neutral mutation – random drift balance model for maintenance of quantitative genetic variation (Lynch and Hill, 1986). The observed magnitude of segregating variation is much less than predicted under this model given our estimates of population size in nature and mutational variation. Therefore, we inferred that a form of stabilizing selection on a large fraction of gene expression as well as organismal traits constrains naturally occurring genetic variation, consistent with earlier studies using a similar analysis in C. elegans (Denver et al., 2005) and inter-specific variation in Drosophila (Rifkin et al., 2005).

Theoretical models of maintenance of quantitative genetic variation by mutation – stabilizing selection balance assume either direct stabilizing selection on each trait or stabilizing selection as a deleterious pleiotropic side-effect of new mutations on fitness (Johnson and Barton, 2005). The former class of model has different quantitative predictions depending on the relative magnitude of mutation and selection. The house-of-cards approximation holds when mutation rates are low, mutational effects are large, and selection is strong (Turelli, 1984); while the Gaussian approximation holds under conditions of weak stabilizing selection, high mutation rates and small mutational effects (Lande, 1975). Our observations of low mutation rates and mutational effects that are much larger than standing DNA polymorphisms favor the former parameterization. Under this model, Vg=4nμVs, where n is the number of loci potentially affecting the trait, μ is the mutation rate, and Vs represents the strength of stabilizing selection (Turelli, 1984). Assuming strong direct stabilizing selection (e.g. Vs=20Ve), high heritabilities (e.g. Vg=Ve) and our estimated mutation rate (μ=6.60 × 109), then n=1.9 × 106, which seems implausibly large given the total size of the D. melanogaster euchromatic genome of approximately 1.2 × 108.

Under simple pleiotropic models of apparent stabilizing selection (all mutations are equally deleterious and have a reduction of heterozygous fitness of s, and the strength of selection is strong (~10–2) against new mutations) the equilibrium genetic variance is Vg=Vm/s and s=Vm/Vg. On average, our estimate of Vm/Vg had a median 1.94 × 10–3 for organismal traits and 1.82 × 10–3 for gene expression traits. Thus, while we detect apparent stabilizing selection on quantitative traits, it is too weak by an order of magnitude for the majority of genes and quantitative traits to be consistent with observed selection against heterozygous effects of new mutations (Mukai et al., 1972; Mackay et al., 1992). Alternatively, the rate of generation of new mutations is too weak a force to counter strong apparent stabilizing selection, which removes variation faster than it is generated.

Our estimates of μ, Vm and Vg do not alter the conclusion that simple mutation – stabilizing selection models for either direct or apparent stabilizing selection cannot maintain the observed amounts of segregating genetic variation with the observed mutational input and strong selection: the mutational variance is too low and/or the standing genetic variance is too high (Turelli, 1985; Hill and Keightley, 1988; Barton and Turelli, 1989; Zhang et al., 2002; Turelli and Barton, 2004). It is possible that stabilizing selection in nature is weaker than assumed (Kingsolver et al., 2001; Kingsolver and Diamond, 2011), and other mechanisms such as balancing selection, fluctuating allelic effects in the face of temporally and spatially varying environments, canalization of mutational effects, and a combination of direct and apparent stabilizing selection all contribute (Barton, 1990; Zhang et al., 2002; Turelli and Barton, 2004; Zhang and Hill, 2005). Ultimately, artificially introducing individual mutations or combinations of mutations and assessing their effects, which is now feasible, will be needed to understand the balance between mutations and selection in maintaining segregating genetic variation for quantitative traits.

Materials and methods

Mutation accumulation lines

The D. melanogaster strain DGRP_360 was generated by 20 generations of strict full-sib mating from an isofemale line derived from the Raleigh, NC USA population (Mackay et al., 2012). We divided DGRP_360 into 25 replicate sublines, each maintained at a census population size of 10 virgin females and 10 males per generation. All lines were maintained in shell vials with 10 ml cornmeal-agar-molasses medium at 25°C, 70% humidity and 12 hr/12 hr light/dark cycle.

DNA extraction, library preparation and sequencing

Genomic DNA was extracted from 100 female flies per MA line at generation 60 using the QIAGEN Genomic-tip 100/G kit (Qiagen, Valencia, CA). The flies were homogenized with a mortar and pestle to a fine white powder using liquid nitrogen and lysed for 2 hr (50°C) with Buffer G2 supplemented with RNAse A (1.5 mg) and Proteinase K (12 mg). The samples were centrifuged at 7000 x g for 30 min at 4°C and the clear lysates applied to a Genomic-tip 100/G that had been equilibrated with Buffer QBT. Once the lysates had passed through the columns, the columns were washed twice with Buffer QC. Genomic DNA was eluted from the column with Buffer QF, precipitated with 100% isopropanol and the DNA pellets washed with 70% ethanol. The genomic DNA pellets were re-suspended in 130 μL of nuclease-free water. Purified genomic DNA (1 μg) was fragmented to an average size of 300–400 bp using Covaris shearing (Covaris, Woburn, MA). DNA libraries were prepared from the fragmented DNA using the Illumina TruSeq DNA Sample Preparation Kit (Illumina, San Diego, CA) by following the manufacturer’s procedure. The fragmented DNA was subjected to end-repair, adenylation of 3’-ends, ligation of indexed paired-end adapters and PCR-enrichment of the barcoded DNA. The libraries were quantified by qPCR using the KAPA SYBR FAST Master Mix Universal 2X qPCR Master Mix (Kapa Biosystems, Wilmington, MA). The sizes of the PCR-enriched libraries were verified by Bioanalyzer using the high sensitivity DNA chip (Agilent, Santa Clara, CA). We multiplexed and sequenced 5 libraries per lane on the HiSeq 2000 (Illumina). Sequence data are deposited to the NCBI SRA database with the accession number for the BioProject SRP068116.

Detection of spontaneous mutation in mutation accumulation lines

Sequence reads from the parental line (DGRP_360) and each of the 25 MA lines were aligned to the Drosophila melanogaster reference genome (BDGP5) using BWA-MEM with default parameters (Li, 2013). Alignments were locally realigned around known indels in the DGRP and around target regions identified across all samples using GATK (DePristo et al., 2011). After PCR duplicate removal and base quality recalibration using GATK, overlapping bases from paired end reads were clipped using bamUtil (http://genome.sph.umich.edu/wiki/BamUtil). Finally, alleles (mapping quality ≥13, base quality ≥13) were piled up using freebayes(Garrison and Marth, 2012). Only lines whose median filtered coverage was above 15 were considered for mutation detection. We considered 109,260,235 sites where between 15 and 250 reads were observed in the parental line and in at least 10 MA lines, and no more than 10 possible alleles were observed in all lines combined. A mutation was called if: (1) no read supported the mutant allele in the parental line; (2) the p value for a Fisher’s exact test assessing strand bias of alleles was conservatively >0.001; (3) no more than two possible alleles were observed in the mutant line; (4) the mutant allele frequency was greater than 20% in the mutant line; (5) no line other than the mutant line contained the mutant allele at frequency greater than 5%; (6) no more than two other lines contained any reads supporting the mutant allele.

Validation of mutations by Sanger sequencing

To validate mutations, we sequenced PCR fragments flanking 51 randomly selected mutations in pooled DNA from five MA lines (MA02, MA11, MA13, MA17, and MA24) in G130 using Sanger sequencing. We sequenced DNA in G130 to allow sufficient time for mutations to fix, because Sanger sequencing cannot readily distinguish low frequency polymorphisms from background noise and is subject to bias in PCR amplification of alleles. On average, mutations fix at a probability equal to their initial frequencies after 4Ne generations. Genomic DNA was extracted from 20 females from the MA lines using the Gentra Puregene Tissue Kit (Qiagen). The flies were homogenized with 2 spherical ceramic beads (MP Biomedical) using the TissueLyser (Qiagen) and lysed for 1 hr (56°C) with Cell Lysis Buffer supplemented with RNAse A (1.5 mg) and Proteinase K (12 mg). Proteins were removed from the clear lysates with protein precipitation solution followed by centrifugation. Genomic DNA was precipitated with 100% isopropanol and the DNA pellets washed with 70% ethanol. The genomic DNA pellets were re-suspended in 100 ul of nuclease-free water. Each sample was diluted to 5 ng/ul and subjected to PCR using 95 primer pairs (Supplementary file 1B) with the following cycling parameters: 95°C for 2 min followed by 30 cycles of 95°C / 30 s + 56°C / 30 s + 72°C / 30 s followed by a final extension step at 72°C for 4 min. The PCR products were purified using the PureLink Pro 96 PCR Purification Kit (Life Technologies, Carlsbad, CA) and sequenced with each corresponding forward primer using the BigDye Terminator Cycle Sequencing Kit (Life Technologies).

Estimation of mutation rate in MA lines

To estimate mutation rate in the MA lines, we first inferred the effective population size using a maximum likelihood approach, assuming that synonymous and non-exonic SNPs are neutral and effective population size and mutation rate stays constant over time and among MA lines. For each 2Ne value ranging from 10 to 40, we simulated 1,000,000 MA lines where one mutation occurred independently per line per generation. At G60, frequencies of all 60,000,000 unlinked mutations were summarized (based on samples of 200 chromosomes in the last generation) to obtain the expected frequency distribution of mutations under a given 2Ne at G60, which allowed us to calculate the likelihood of observing the frequency distribution of neutral sites. We calculated the multinomial likelihood L=(Mm1, m2, , m16)i=116Pimi of the observed mutant frequency distribution, where i indexed one of the 16 equally sized allele frequency bins between 0.2 and 1 ((0.2, 0.25], (0.25, 0.30), …, (0.95, 1]). Pi is the probability of observing a mutant allele in the ith bin given the expected mutant frequency distribution based on simulation, and m1,m2, , m16  are the number of mutations in each bin and summed to M. We inferred effective population size for autosomes and X chromosomes separately. Because of the small number of mutations in each line, the mutant frequency distribution was summarized across all 23 lines to provide an overall estimate of 2Ne across all lines. Mutation rate in each line was then estimated as μ=mt*2Ne*p*B, where m is the number of mutations, t=60 is the number of generations, 2Ne is the effective population size, p is the estimated marginal probability of a mutation attaining 0.2 frequency given 2Ne, and B is the number of bases considered for mutation calling in that line.

Acquisition of microarray data

At G60 we assessed whole genome transcript profiles of the 25 MA lines for 3–5 day old males and females, with two biological replicates per sex and line, using Affymetrix Drosophila 2.0 arrays. All samples were harvested between 9–11 am. Whole bodies of 10 flies per sample were homogenized with 1 mL of QIAzol lysis reagent (Qiagen) and two ¼ inch ceramic beads (MP Biomedical) using the TissueLyser (Qiagen) adjusted to a frequency of 15 Hz for 1 min. Total RNA was extracted using the miRNeasy 96 kit (Qiagen) with on-column DNAse I digestion and following the spin technology protocol as outlined in the manufacturer’s manual. The RNA was eluted with 45 μL of RNAse-free water. Total RNA was quantified using a NanoDrop 8000 spectrophotometer (Thermo Scientific, Carlsbad, CA). The 100 RNA samples were processed at all stages in a strict randomized design. Fragmented biotin-labeled aRNA were prepared for hybridization to GeneChip Drosophila Genome 2.0 arrays as described in the GeneChip 3’ IVT Express Kit user manual (Affymetrix P/N 702646 Rev.5). Briefly, 200 ng of total RNA was reverse-transcribed to synthesize first-strand cDNA. The cDNA was converted to double-stranded DNA and used as a template for in vitro transcription to synthesize biotin-labeled aRNA. The aRNA was purified using magnetic RNA binding beads and quantified using a NanoDrop 8000 spectrophotometer (Thermo Scientific). 12 μg of purified biotin-labeled aRNA were fragmented and hybridized to GeneChip Drosophila Genome 2.0 arrays (Affymetrix). All microarray data have been deposited to ArrayExpress with accession number E-MTAB-4117.

Preprocessing of microarray data

In addition to the 3’ IVT array data that were generated in this study, we downloaded raw microarray expression data from a previous study profiling the transcriptomes of 40 DGRP lines (Ayroles et al., 2009) (ArrayExpress E-MEXP-1594), another profiling the transcriptomes of a synthetic outbred population derived from the same 40 DGRP lines under different environmental conditions (Zhou et al., 2012) (ArrayExpress E-MTAB-639). Arrays for DGRP_514, MA-21 and MA-23 were not considered because we lacked sequence information for these lines. Probe intensities were corrected for background hybridization using GCRMA (Wu et al., 2004) and quantile normalized within each sex. We lifted the probe target alignment to FlyBase annotation (Release 5.57) and retained only those that mapped entirely (can span exon-exon junctions) to constitutive non-overlapping exons and contained no DNA variation among the MA lines or the 40 DGRP lines. The normalized probe intensities were log 2 transformed and estimates of gene expression were obtained using median polish, which adjusted probe effect and removed potential outliers. We did a preliminary normalization and stringently removed arrays that appeared to be contaminated by flies of the opposite sex or contained more than 1% of genes whose expression was more than five standard deviations higher or lower than the mean expression of the same genes across all arrays. A total of 23 out of the 368 arrays were removed before a final normalization was performed on the remaining 345 arrays.

Analysis of gene expression variance

We analyzed gene expression for each sex within each experiment separately. For each gene, we first transformed the data to be normally distributed by taking the quantiles of a normal distribution with the mean equal to the median expression and the variance equal to the square of the median absolute deviation (by a scaling factor of 1.4824 to account for the expected difference of median absolute deviation and variance, which has no effect on the statistical inference). We defined a gene as expressed if its expression level was above 3.71 in females or above 4.26 in males. These cutoffs were chosen such that by modeling the expression profile as a mixture of two normal distributions, for a gene with expression higher than the cutoff, its probability of belonging to the high expression group was at least twice that of belonging to the low expression group (Figure 14). Partition of variance into between-line and within-line variances was performed using the lme4 package in R. It is important to note that the microarray data were collected in three studies such that data sources were completely confounded with the experiments (MA, DGRP, and environmental exposure). Importantly, all estimation of parameters was performed within each experiment such that any between-experiment batch effects that shifted genome-wide gene expression profiles would not affect our results. Batch effects that increased or decreased variances in gene expression, however, may affect the results. Nonetheless, as line or treatment was randomized within each experiment, any such batch effect on variance would be much more likely to affect the within-line variance than the among-line variances that were used for comparison. More importantly, the magnitude of differences in variances, which was also consistent with other studies that assessed the differences within a single batch (Denver et al., 2005; Rifkin et al., 2005), was too large to be explained by a batch effect.

Classification of genes as expressed or not expressed in each sex.

Within each sex (females in a, males in b), the distribution of median expression for each gene across the MA lines is subject to a mixture model analysis with two normal components. The mean and variance of each component distribution is estimated using the mixtools package in R. A gene is called expressed if its posterior probability of belonging to the normal distribution with the larger mean is higher than 2/3. The histograms show the observed distribution and the estimated normal distributions and their mixture.

https://doi.org/10.7554/eLife.14625.016

Analysis of organismal quantitative trait variance

All organismal traits were scored at generation 61, one generation after the DNA was sequenced and RNA was analyzed. Abdominal bristle number is the sum of the numbers of abdominal chaetae on the two most posterior abdominal sternites, and sternopleural bristle number is the sum of the total number of macrochaetae and microchaetae on the left and right sternopleural plates. Bristle numbers were scored on 10 males and 10 females from each of two replicate vials per line (total N=1,000). Five sleep traits (total sleep duration during the night and day, numbers of sleep bouts during the night and day, and total waking activity) (Harbison et al., 2009) were measured on 3–5 day old flies, for 16 virgin males and 16 virgin females per line (N=800). Prior to sleep measurement, all flies were maintained at a constant density of 30 flies per same-sex vial to mitigate the effects of both social exposure and mating on sleep. We recorded seven continuous days of sleep and activity using the Drosophila Activity Monitoring System (Trikinetics, Waltham, MA), which measures the numbers of times each fly crosses an infrared beam. Data from flies that did not survive the entire recording period were not used in the sleep calculations. Sleep duration was calculated as any period of inactivity lasting at least five minutes. Waking activity was calculated as the number of times the fly crossed the infrared beam divided by the total time awake. We also measured bristle numbers in the 39 DGRP lines for which we had expression data or obtained the sleep phenotypic data for the same lines from a previous study (Harbison et al., 2009). The same analysis as the gene expression data was performed to partition variance for the organismal quantitative traits for both MA and DGRP lines, except for bristle numbers, for which an additional between-replicate term was also included.

Variance in quantitative traits due to mutations, standing variation, and environmental variation

The between-line variance for MA lines VMA under a neutral and polygenic model is approximately, VMA=kVm, where k=2[ t(2Ne5)(1et2Ne) ], t=60 (for gene expression traits) or 61 (for organismal traits) is the number of generations since the lines were established, Ne is the effective population size (2Ne=21) by taking the average of autosomes and X chromosomes) and Vm is the mutational variance (Lynch and Hill, 1986; Mackay et al., 1992). We scaled Vm by the environmental variance Ve to obtain the mutational heritability hm2=Vm/Ve. For gene expression traits, the within-line variance is Ven+Vr, where n=10 is the number of flies per sample and Vr is the technical variation or measurement error of microarrays. Vr is believed to be small thus we multiplied the within-line variance by n to obtain an upwardly biased approximation of Ve, which underestimates hm2. For organismal traits, Ve was assumed to be the within-line variance (sleep traits) or the sum of within-line and between-replicate variances (bristle numbers).

At equilibrium, the among-line variance between a set of inbred lines such as the DGRP (Vg) is approximately 4NVm, where N is the effective population size (Lynch and Hill, 1986) of the wild population from which the DGRP was derived. Therefore VmVg=14N when quantitative traits evolve neutrally. We used π=4Nμ=4.92 × 103 in the DGRP to estimate N. Because whole genome sequences are available, we can also derive the amount of quantitative trait variance due to sequence divergence, without the assumption of neutrality. The expectation for among-line variance for a trait is Vg= i=1GE[Var(aigi)]=i=1GE(ai)Var(gi)=E(ag2)i=1GVar(gi), where ai is the allelic effect expressed as a deviation from the ancestral allele at each locus and has an expectation of E(ag2), gi is the number of copies of the mutant allele in the ith line and can take a value in the range of 0 and 2, with numbers between 0 and 2 (twice of the segregating frequency in that line) used to represent lines where the alleles are still segregating, and Var(gi) is simply the sample variance of gi. This formulation is insensitive to the sign of a and therefore does not require polarization of alleles in the DGRP. For MA lines, mutations below a frequency of 0.2 were randomly drawn from the expected distribution based on the inferred effective population size. The expectation of among-line variance follows the same form: Vm= i=1ME[Var(aimi)]/k=i=1ME(ai)Var(mi)/k=E(am2)i=1MVar(mi)/k, where ai is the allelic effect of the mutation and has an expectation of E(am2), mi measures the number of mutant alleles in the MA line, and k is as defined above. Therefore VmVg=E(am2)E(ag2) Var(mi)/kVar(gi), and the ratio of Vm to Vg measures the difference between E(am2) and E(ag2). If the effect size distributions were equal, VmVg=Var(mi)/kVar(gi). Finally, we estimate variance in gene expression traits due to environmental variation by the among-treatment variance (VENV) in a study of DGRP derived flies subject to 20 diverse and potentially harsh environments (Zhou et al., 2012).

References

  1. 1
  2. 2
  3. 3
  4. 4
    Pleiotropic models of quantitative variation
    1. NH Barton
    (1990)
    Genetics 124:773–782.
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
    arXiv.org
    1. E Garrison
    2. G Marth
    (2012)
    Haplotype-based variant detection from short-read sequencing, arXiv.org, https://arxiv.org/abs/1207.3907.
  10. 10
  11. 11
  12. 12
    Interrelations of mutation, population size, artificial and natural selection
    1. WG Hill
    2. PD Keightley
    (1988)
    Proc Second Int Conf Quant Genet pp. 57–70.
  13. 13
    Comparing mutational variabilities
    1. D Houle
    2. B Morikawa
    3. M Lynch
    (1996)
    Genetics 143:1467–1483.
  14. 14
  15. 15
    Theoretical models of selection and mutation on quantitative traits
    1. T Johnson
    2. N Barton
    (2005)
    Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences 360:1411–1425.
    https://doi.org/10.1098/rstb.2005.1667
  16. 16
  17. 17
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
    Aligning sequence reads, clone sequences and assembly contigs with BWA-MEM
    1. H Li
    (2013)
    arXiv Prepr arXiv 00:3.
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
    Mutation rate and dominance of genes affecting viability in Drosophila melanogaster
    1. T Mukai
    2. SI Chigusa
    3. LE Mettler
    4. JF Crow
    (1972)
    Genetics 72:335–355.
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
    Effects of pleiotropy on predictions concerning mutation-selection balance for polygenic traits
    1. M Turelli
    (1985)
    Genetics 111:165–195.
  34. 34
  35. 35
  36. 36
    Pleiotropic model of maintenance of quantitative genetic variation at mutation-selection balance
    1. XS Zhang
    2. J Wang
    3. WG Hill
    (2002)
    Genetics 161:419–433.
  37. 37
  38. 38

Decision letter

  1. Molly Przeworski
    Reviewing Editor; Columbia University, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Spontaneous Mutation and the Origin and Maintenance of Quantitative Genetic Variation" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Diethard Tautz as the Senior Editor.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The following individuals involved in review of your submission have agreed to reveal their identity: Bill Hill (peer reviewer) and Nick Barton (peer reviewer).

As you will see, all three reviewers agreed that the study represented a highly substantial undertaking and that while the results are not surprising, they are an important addition to the literature (as are the data). However, the reviewers also had a number of suggestions that must be addressed for the paper to be published in eLife.

1) The substantive concern is that only one MA line is used to estimate the mutation rate and thus that, in the presence of mutation rate variation among lines, the comparison to the DGRP lines is problematic (see Reviewer 2's comments).

This concern needs to be addressed, minimally by making this limitation of the study design explicit in the Discussion, and ideally by providing some quantification of its possible effect.

2) Two reviewers point out that the term "stabilizing selection" is too loosely used.

3) Perhaps most importantly for the dissemination of the work, the paper is not written in a manner accessible to people outside the field. As suggested by Reviewer 3, the expectations need to be more clearly set up, and the methods described in greater detail.

The reviewers also had a number of other suggestions to be taken into account; in particular, it would be helpful to discuss the possibility that MA lines have higher mutation rates than do natural populations. These follow.

Reviewer #1:

This paper does three things:

1) It estimates the spontaneous mutation rate in Drosophila from a collection of 25 sixty generation mutation accumulation lines;

2) Measures Phenotypes for several characters (gene expression and a half dozen or so quantitative traits);

3) Measure the SAME set of phenotypes on DGRP lines.

An estimate of the spontaneous mutation rate (at SNPs and small INDELs) is obtained. Furthermore the variation in measured quantitative traits in the MA lines relative to DSPR lines is too large to be consistent with those traits evolving neutrally.

In terms of estimating the mutation rate in a collection of MA lines. This experiment has been carried out in several other systems and even in D. mel. (as the authors point out; Keightley et al. 2009 and Schrider et al. 2013) – albeit the oft cited D. mel study of Keightley only examines 3 MA lines. Nonetheless, it does not appear that the estimate of mutation rate dramatically changes.

The second conclusion that quantitative traits (QTs) must be under stabilizing selection (because the ratio of natural variation to variation in the MA lines is too low) is not all that surprising. Studies have already concluded as much for gene expression traits by comparing standing variation to divergence (Rifkin et al., 2005 seems to be the go to citation for this). And of course it is not surprising that the quantitative traits are under stabilizing selection. Bristle are sort of a classic QT, that they are under stabilizing selection is discussed in the "textbook" of Falconer and Mackay. It is perhaps a novel observation that sleep and activity traits are under stabilizing selection, but not really surprising (thinking of humans sleep time has a mean of 8 hours and SD of 1 hour, it is easy to imagine sleeping 6 SD's more or less is highly deleterious).

Barring these earlier studies being flawed in some way, this paper doesn't really cover new ground. Overall this paper doesn't really make any novel and/or surprising discoveries, unless I am missing something.

That being said, a really valuable aspect of the paper is that everything is done better than previous studies. And the same MA lines used to measure mutations are used to measure phenotypes, and the same set of phenotypes is measured in the same way in the MA lines and DGRP lines. Furthermore the PI (Mackay) has an incredible track record of quality control, and putting the data out there so anyone can easily download and use it. This work seems like no exception. So what we have is a much better and cleaner dataset that is freely available and useful, and in my mind this is an important contribution.

Reviewer #2:

In this paper from Trudy Mackay's group, a highly inbred line which is one of the DGRP inbreds of Drosophila derived as an iso-female line from an outbred population, is used as source from which 25 replicate mutation accumulation (MA) lines were derived and maintained for 60 generations of random mating with 10 parents of each sex. Observations on a number of quantitative traits including bristle number and gene expression and DNA sequences were obtained on all (or nearly all) MA lines. Thus the authors could estimate spontaneous mutation rates for variables ranging from molecular sequence to phenotypes of quantitative traits. The variances and distributions of traits among these lines were compared with those among the DGRP lines and differences used to infer the degree of selection operating in the new lines.

The whole idea seems to me a very nice one, and the scale of this experiment is enormous. The results, whatever they were, are bound to be of general interest and by their analyses and interpretation they add a lot more. I strongly recommend publication, after a relatively small amount of revision. I do, however, have a number of comments, the first being what I think is my main concern with the paper and could be argued is a fatal flaw.

1) A single line (DGRP 360) was used to generate the new MA lines. One significant observation from these lines was that, inter alia, the base mutation rates differed among the new MA lines (Figure 6c). This of course implies that comparisons between mutation rates for traits such as bristle number also presumably vary among lines. The evidence for and estimates of e.g. stabilising selection on the phenotypic traits, notably bristle number, were obtained by comparing the variance among the DGRP lines with that predicted from these mutation rates in the MA lines. As there is evidence of between line variance in the MA lines derived from a single DGRP line, there is every reason to expect that there would be differences in the mutation rates among the DGRP lines themselves. Hence the estimates of mutation rate and of stabilising selection effects in comparing the MA and DGRP lines is, presumably, sensitive to choice of DGRP line.

I think the authors need to consider what I think is a serious flaw in the design of the experiment and thus underestimation of the sampling variance and overestimation of significance levels of some of the results. A better design would, in some respects, have been to have taken a subset of MA lines from several (a sample of) DGRP lines to assess and allow for genetic variation in mutation rates among the DGRP lines. There are obviously other benefits in terms of comparisons in sequence etc. by just using one DGRP line, but the problem of lack of replication remains nevertheless. At the very least the authors need to discuss this: the flaw will be apparent to others also.

Reviewer #3:

This is a very substantial study that measures accumulation of mutations in both sequence and quantitative traits in the same mutation accumulation lines, and compares them with standing variation in the original natural population. The analysis is fairly sophisticated (though not well explained), and the estimates are remarkably close to those from previous studies, and estimated by less direct methods.

Overall, this is a large-scale study that addresses an important issue: whether quantitative variation is maintained by a balance between mutation and selection. The results are not surprising, and support the consensus view, but that should not count against this work. On balance, I think that it should be suitable for eLife, but it does need some considerable revision – mostly, to explain what was done more clearly to a reader who has not been immersed in these issues.

Generally, the paper is written for readers thoroughly familiar with these kinds of experiment. There is some explanation in the Discussion, but it needs to come much earlier. There needs to be a clearer statement early on about what is expected in mutation accumulation experiments. Highly deleterious mutations will not be seen, but weakly deleterious mutations will accumulate in the MA lines, and will be more frequent than in nature.

The estimates here do not include the fraction of strongly deleterious mutations. In particular, one expects recessive deleterious mutations to potentially rise to moderate frequency in these inbred lines. This would not necessarily show up in Figure 1 if the bulk of mutations are nearly neutral, but could be tested for by asking whether the MA lines can be made homozygous. A comment at least is needed on this issue.

A pervasive problem is the use of the term "stabilising selection" to refer to any kind of purifying selection. It would be clearer to restrict it to selection on a continuous trait towards an intermediate optimum, and it needs t be stead always whether this is selection on the observed or on hidden traits. In any case, it is simpler to see Vm/Vg as an estimate of net selection against the alleles that contribute trait variance, and avoid speculation about whether this acts via stabilising selection (in the sense defined here).

Results, first paragraph: It would be interesting to compare the frequency of accumulated mutations with the spontaneous mutation rate, measured in other studies that compare parents with offspring – the difference being that the MA lines will not contain highly deleterious mutations.

A serious worry is whether mutation rate is higher, and different in kind, in inbred lines than in the wild population – cf Agrawal's recent work.

Results,second paragraph: "Generally agreed well" is rather vague – one needs a statistical test of whether the rate and spectrum of mutations is the same as in previous work (as SI)

Subsection “Spontaneous mutation rate”: Given the population size, what is the fixation probability as a function of h and s? As it is, the reader has to work this out for herself.

In the same section, there should be a bit more explanation of how the mutation rate is estimated in the main text (a sentence, at least). This is not at all straightforward.

Subsection “Rate of introduction of genetic variation by spontaneous mutations”: What is meant by "global gene expression"? Even after reading the Methods, I could not work this out. It seems bizarre to pool the expression level of all the genes, because, I assume, expression of individual genes will be analysed elsewhere. Still, it is not clear what "global gene expression" is measuring, especially since the expression measurements were made separately in the three experiments and so are not directly comparable.

Second paragraph of the same section: What are the units of measurement here? It is not obvious how to compare variance in gene expression, but the definition of the trait is not even stated in the main text. In the Methods, it emerges that there has been a very obscure normalisation procedure. Are we comparing the variance of absolute values? The variance on a log scale? The variance relative to Ve? This is crucial to how we interpret these variances.

Subsection “Strong stabilizing selection on quantitative trait variation”: Seeing lower standing variance than expected under neutrality does not imply stabilising selection; there could instead be purifying selection against alleles that have pleiotropic effects on the measured traits. The term "stabilising selection" needs to be defined carefully.

In the same section, the comparison of E(am2) vs E(ag2)is confusing. The implicit assumption here is that stabilising selection acts on the observed traits, which is wildly implausible. Rather, Vm/Vg gives an estimate of the net selection on each allele. This selection may be due to stabilising selection on unobserved traits, and/or to direct selection against the alleles. (The distinction here depends of course on what "traits" one defines). It is confusing to use the term "stabilising selection" here.

The basic theory is explained in the Discussion but needs to come much earlier to avoid confusion.

Discussion, paragraph four: The estimate of selection from Vm/Vg is an order of magnitude smaller than estimates of selection against deleterious mutations generally. However, this is consistent with a broad distribution of effects on fitness and on traits: the alleles that contribute to genetic variance will tend to be less deleterious than spontaneous mutations. Also, of course, balancing selection may contribute genetic variance.

Subsection “Estimation of mutation rate in MA lines”: Does the actual frequency distribution of (presumed) neutral sites actually fit the simulation model? Figure 6 suggests yes, but this is not commented on in the text.

It seems rather inefficient to just record the number of mutations at p>0.2 – surely using the actual frequency would be more informative? One would also like to see whether non-synonymous mutations are less likely to reach high frequency, as expected if they are recessive deleterious.

Subsection “Variance in quantitative traits due to mutations, standing variation, and environmental variation”: This formula for VMA makes no sense; it will be negative for large times. There is no explanation of where it comes from.

It is annoying that the figures are all given at the end, but are separated from their captions, and are not numbered.

Figures 7 and 9 are especially inscrutable.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Spontaneous Mutation and the Origin and Maintenance of Quantitative Genetic Variation" for further consideration at eLife. Your revised article has been favorably evaluated by Diethard Tautz (Senior editor), a Reviewing editor, and three reviewers.

All agree that the manuscript has been improved. There are a couple of remaining points that still need to be addressed (see Reviewer 3's comments), either through revisions to the writing or in a brief note explaining why you do not agree with the concern.

See the specific comments below:

Reviewer #1:

All my concerns were addressed in the response and via the inclusion of the new Figure 1.

Reviewer #2:

This is a revision of a major paper I refereed earlier. I am satisfied with the revision, recognising that the major problem about replication of source lines mutation accumulation is not feasible at this stage, and arguably never was in view of resources needed. The authors respond adequately on this point.

It should now be published.

Reviewer #3:

The revision more or less answers my criticisms, though the changes are minimal. I accept that starting from a single line was a reasonable choice, given the effort involved, and this is now explained better. However, I am still not satisfied on a couple of points:

The revisions have been mostly to better explain the experimental design. However, my original criticism was more that the underlying theory is not set out well at the beginning: this has not been addressed.

A specific problem is with a key interpretation of the results: in paragraph seven of the Discussion. it's stated that the estimated selection is too weak to be consistent with classical estimates of selection against alleles that influence viability. I don't think this necessarily follows. If there is a distribution of effects on traits, then the ratio Vm/Vg will be smaller for traits that are less closely connected with fitness. With rare alleles, and assuming mutation/selection balance, the selection estimated from Vm/Vg equals the harmonic mean s, weighted by the squared affect on traits. Thus, of the effect on the trait is not correlated with fitness, we have 1/<1/s>, whereas if the trait is fitness itself, we have /, which must be larger. So, the differences in Vm/Vg between viability and gene expression are surely to be expected?

These issues are discussed in Charlesworth's recent PNAS review, which should be not just cited, but used to frame the interpretation of these data.

A point of terminology: I don't think that "apparent stabilising selection" is used correctly. To me, this refers to the reduction in fitness of extreme phenotypes that is observed even for a neutral trait, when trait variation is due to pleiotropic effects of mutations. However, here "apparent SS" refers to the ratio Vm/Vg under this pleiotropic model. I think that "purifying" or "pleiotropic" selection would be a better term to use.

https://doi.org/10.7554/eLife.14625.022

Author response

As you will see, all three reviewers agreed that the study represented a highly substantial undertaking and that while the results are not surprising, they are an important addition to the literature (as are the data). However, the reviewers also had a number of suggestions that must be addressed for the paper to be published in eLife.

1) The substantive concern is that only one MA line is used to estimate the mutation rate and thus that, in the presence of mutation rate variation among lines, the comparison to the DGRP lines is problematic (see Reviewer 2's comments).

This concern needs to be addressed, minimally by making this limitation of the study design explicit in the Discussion, and ideally by providing some quantification of its possible effect.

We added discussion to the manuscript about this limitation (please also see below our response to each of the reviewers’ comments). The main consideration at the time of designing the experiment was to balance cost and accuracy. While there could be genetic variation in mutation rates, we were more concerned with obtaining accurate estimates of mutational variance across a large number of traits, which required a relatively large number of MA lines. As it turns out, the strength of stabilizing selection inferred based on our mutational variance estimates was too large to be explained by genetic variation in mutation rates.

2) Two reviewers point out that the term "stabilizing selection" is too loosely used.

We have now revised extensively the entire manuscript to improve clarity in what we mean by stabilizing selection. We distinguish real stabilizing selection and apparent stabilizing selection with more qualifying terms whenever possible. We now refer to most of the observations as “apparent stabilizing selection” unless we specifically refer to different models.

3) Perhaps most importantly for the dissemination of the work, the paper is not written in a manner accessible to people outside the field. As suggested by reviewer 3, the expectations need to be more clearly set up, and the methods described in greater detail.

We have carefully followed all related comments by the reviewers that suggest expanding the description of methods and explanation of results, including adding a schematic diagram to outline the experimental design (a new Figure 1). We believe these revisions throughout the text have greatly improved the manuscript and we appreciate the suggestions from the editor and the reviewers.

The reviewers also had a number of other suggestions to be taken into account; in particular, it would be helpful to discuss the possibility that MA lines have higher mutation rates than do natural populations. These follow.

All other comments have also been carefully considered. Revisions were either made accordingly, or detailed responses to the comments are given below.

Reviewer #1:

This paper does three things:

1) It estimates the spontaneous mutation rate in Drosophila from a collection of 25 sixty generation mutation accumulation lines;

2) Measures Phenotypes for several characters (gene expression and a half dozen or so quantitative traits);

3) Measure the SAME set of phenotypes on DGRP lines.

An estimate of the spontaneous mutation rate (at SNPs and small INDELs) is obtained. Furthermore the variation in measured quantitative traits in the MA lines relative to DSPR lines is too large to be consistent with those traits evolving neutrally.

In terms of estimating the mutation rate in a collection of MA lines. This experiment has been carried out in several other systems and even in D. mel. (as the authors point out; Keightley et al. 2009 and Schrider et al. 2013) – albeit the oft cited D. mel study of Keightley only examines 3 MA lines. Nonetheless, it does not appear that the estimate of mutation rate dramatically changes.

The second conclusion that quantitative traits (QTs) must be under stabilizing selection (because the ratio of natural variation to variation in the MA lines is too low) is not all that surprising. Studies have already concluded as much for gene expression traits by comparing standing variation to divergence (Rifkin et al., 2005 seems to be the go to citation for this). And of course it is not surprising that the quantitative traits are under stabilizing selection. Bristle are sort of a classic QT, that they are under stabilizing selection is discussed in the "textbook" of Falconer and Mackay. It is perhaps a novel observation that sleep and activity traits are under stabilizing selection, but not really surprising (thinking of humans sleep time has a mean of 8 hours and SD of 1 hour, it is easy to imagine sleeping 6 SD's more or less is highly deleterious).

Barring these earlier studies being flawed in some way, this paper doesn't really cover new ground. Overall this paper doesn't really make any novel and/or surprising discoveries, unless I am missing something.

That being said, a really valuable aspect of the paper is that everything is done better than previous studies. And the same MA lines used to measure mutations are used to measure phenotypes, and the same set of phenotypes is measured in the same way in the MA lines and DGRP lines. Furthermore the PI (Mackay) has an incredible track record of quality control, and putting the data out there so anyone can easily download and use it. This work seems like no exception. So what we have is a much better and cleaner dataset that is freely available and useful, and in my mind this is an important contribution.

We thank the reviewer for the enthusiasm for this study. We agree with the reviewer that this study indeed does not make surprising observations. Nonetheless, we believe studies like this that revisit old problems with newer technologies are immensely useful. As the reviewer points out, this study contains the most comprehensive and accurate estimates of several important genetic parameters and a useful large and open data set, which are important for further advances in this field. Please note, these are DGRP lines, not DSPR lines. The latter were derived from only eight inbred strains and do not capture the range of natural variation encompassed by the DGRP.

Reviewer #2:

In this paper from Trudy Mackay's group, a highly inbred line which is one of the DGRP inbreds of Drosophila derived as an iso-female line from an outbred population, is used as source from which 25 replicate mutation accumulation (MA) lines were derived and maintained for 60 generations of random mating with 10 parents of each sex. Observations on a number of quantitative traits including bristle number and gene expression and DNA sequences were obtained on all (or nearly all) MA lines. Thus the authors could estimate spontaneous mutation rates for variables ranging from molecular sequence to phenotypes of quantitative traits. The variances and distributions of traits among these lines were compared with those among the DGRP lines and differences used to infer the degree of selection operating in the new lines.

The whole idea seems to me a very nice one, and the scale of this experiment is enormous. The results, whatever they were, are bound to be of general interest and by their analyses and interpretation they add a lot more. I strongly recommend publication, after a relatively small amount of revision. I do, however, have a number of comments, the first being what I think is my main concern with the paper and could be argued is a fatal flaw.

1) A single line (DGRP 360) was used to generate the new MA lines. One significant observation from these lines was that, inter alia, the base mutation rates differed among the new MA lines (Figure 6c). This of course implies that comparisons between mutation rates for traits such as bristle number also presumably vary among lines. The evidence for and estimates of e.g. stabilising selection on the phenotypic traits, notably bristle number, were obtained by comparing the variance among the DGRP lines with that predicted from these mutation rates in the MA lines. As there is evidence of between line variance in the MA lines derived from a single DGRP line, there is every reason to expect that there would be differences in the mutation rates among the DGRP lines themselves. Hence the estimates of mutation rate and of stabilising selection effects in comparing the MA and DGRP lines is, presumably, sensitive to choice of DGRP line.

I think the authors need to consider what I think is a serious flaw in the design of the experiment and thus underestimation of the sampling variance and overestimation of significance levels of some of the results. A better design would, in some respects, have been to have taken a subset of MA lines from several (a sample of) DGRP lines to assess and allow for genetic variation in mutation rates among the DGRP lines. There are obviously other benefits in terms of comparisons in sequence etc. by just using one DGRP line, but the problem of lack of replication remains nevertheless. At the very least the authors need to discuss this: the flaw will be apparent to others also.

We thank the reviewer for raising this important point. We entirely agree with the reviewer that mutation rate and mutational variance can be genetically variable and thus specific to each line. We added discussion in the manuscript to acknowledge this limitation and its implications. We did create MA lines originating from multiple DGRP lines but decided to focus on MA lines derived from one progenitor line for the following reasons. The scale of the work is enormous. When deciding between a feasible design of a few ancestors each with a small number of lines and one ancestor with a large number of lines, we chose the latter, aiming for accurate estimates for one line instead of sub-optimal estimates of several lines. In retrospect, especially given the declining cost of sequencing, we could have analyzed more lines from multiple ancestors. Nevertheless, earlier studies using multiple MA lines derived from different ancestors did not suggest a large genetic variation in mutation rate (Keightley et al., 2009). Moreover, because we need to estimate mutational variance accurately for a large number of traits, we are more concerned with their accuracy than mutation rate per se. Perhaps most importantly, the level of the observed stabilizing selection was too large to be explained by genetic variation between lines. While we acknowledge the limitation in the experimental design, what we observed are likely to generalize and indeed are corroborated by earlier studies (also see references in the text).

Reviewer #3:

This is a very substantial study that measures accumulation of mutations in both sequence and quantitative traits in the same mutation accumulation lines, and compares them with standing variation in the original natural population. The analysis is fairly sophisticated (though not well explained), and the estimates are remarkably close to those from previous studies, and estimated by less direct methods.

Overall, this is a large-scale study that addresses an important issue: whether quantitative variation is maintained by a balance between mutation and selection. The results are not surprising, and support the consensus view, but that should not count against this work. On balance, I think that it should be suitable for eLife, but it does need some considerable revision – mostly, to explain what was done more clearly to a reader who has not been immersed in these issues.

We have extensively revised the manuscript, thanks to constructive comments from the editor and reviewers. One main revision was to explain better the experimental design and expectation of the experiments, as this reviewer suggests.

Generally, the paper is written for readers thoroughly familiar with these kinds of experiment. There is some explanation in the Discussion, but it needs to come much earlier. There needs to be a clearer statement early on about what is expected in mutation accumulation experiments. Highly deleterious mutations will not be seen, but weakly deleterious mutations will accumulate in the MA lines, and will be more frequent than in nature.

The estimates here do not include the fraction of strongly deleterious mutations. In particular, one expects recessive deleterious mutations to potentially rise to moderate frequency in these inbred lines. This would not necessarily show up in Figure 1 if the bulk of mutations are nearly neutral, but could be tested for by asking whether the MA lines can be made homozygous. A comment at least is needed on this issue.

This is a known limitation of MA lines, which must be free of lethal and highly deleterious mutations to keep them alive. Furthermore, beneficial mutations tend to fix at a higher probability thus biasing mutation rate estimates upwards in MA lines. These limitations have now been acknowledged both in the Introduction and in Discussion

A pervasive problem is the use of the term "stabilising selection" to refer to any kind of purifying selection. It would be clearer to restrict it to selection on a continuous trait towards an intermediate optimum, and it needs t be stead always whether this is selection on the observed or on hidden traits. In any case, it is simpler to see Vm/Vg as an estimate of net selection against the alleles that contribute trait variance, and avoid speculation about whether this acts via stabilising selection (in the sense defined here).

We have revised the text extensively to refer to the stabilizing selection indicated by Vm/Vg as apparent stabilizing selection and distinguish between stabilizing selection due to direct fitness effects for a trait or pleiotropic effects where possible.

Results, first paragraph: It would be interesting to compare the frequency of accumulated mutations with the spontaneous mutation rate, measured in other studies that compare parents with offspring – the difference being that the MA lines will not contain highly deleterious mutations.

This comparison would certainly be very interesting. However, it is not as simple as it appears and goes beyond the scope of this study. First, there has been only a limited number of de novo mutations detected in parent-offspring trio data in Drosophila (e.g. Keightley, et al., Genetics, 2014). The mutation rate estimated using a full-sib family agreed within the 95% confidence intervals with ours (Keightley, et al. 2014), suggesting that the mutation rate was not drastically higher in inbred lines. However, there were only eight mutations in total detected and confirmed in the Keightley (2014) study, which brings us to the second point. Even if there were such trio data, because of the small mutation rate, the number of mutations present in a single fly genome is small (on average ~1-2 mutations per fly). This requires an enormous number of flies to be sequenced to enable a meaningful comparison. This is precisely the point of doing mutation accumulation, which is to accumulate mutations in the MA lines so there characteristics can be properly studied.

A serious worry is whether mutation rate is higher, and different in kind, in inbred lines than in the wild population – cf Agrawal's recent work.

This is a problem that goes beyond the scope of the present study. We now note this limitation and its implications in the Discussion. The bottom line is, it is unlikely that this mutation rate bias in inbred lines alone is able to explain the large magnitude of strong apparent stabilizing selection.

Results,second paragraph: "Generally agreed well" is rather vague – one needs a statistical test of whether the rate and spectrum of mutations is the same as in previous work (as SI).

We changed the statement to include specific numbers from previous work. Regarding the test of the spectrum of mutations, we do not expect exact agreement between this work and earlier studies because different bioinformatic procedures can have significant effects on the inference of the mutation spectrum. Nevertheless, we detected 162 A/T->G/C, 91 A/T->C/G, 148 A/T->T/A, 433 G/C->A/T, 158 G/C->C/G, 211 G/C->T/A mutations while Keightley et al. (2009) detected 33, 20, 21, 53, 20, 27 (Table 5 of the reference) such mutations respectively, which had a Fisher’s exact test P value of 0.16. Thus our mutations and theirs do not have characteristic difference. However, we choose not to include this in the manuscript to avoid false expectation.

Subsection “Spontaneous mutation rate”: Given the population size, what is the fixation probability as a function of h and s? As it is, the reader has to work this out for herself.

We added to the manuscript a range of fixation probabilities given a range of. For simplicity, we assumed additivity so that a one dimensional range can be provided. It is important to note that, because of the small population size, for small to moderate s, the fixation probability is very close to the neutral probability, which is 1/2N.

In the same section, there should be a bit more explanation of how the mutation rate is estimated in the main text (a sentence, at least). This is not at all straightforward.

According to the reviewer’s suggestion, we added a brief description of the mutation rate estimation process in the main text.

Subsection “Rate of introduction of genetic variation by spontaneous mutations”: What is meant by "global gene expression"? Even after reading the Methods, I could not work this out. It seems bizarre to pool the expression level of all the genes, because, I assume, expression of individual genes will be analysed elsewhere. Still, it is not clear what "global gene expression" is measuring, especially since the expression measurements were made separately in the three experiments and so are not directly comparable.

We apologize for the confusing term. We mean genome-wide gene expression profiles. The “global” was meant to highlight the genome-wide characteristic of microarrays. To avoid confusion, we replace this term with “genome-wide gene expression profiles”.

Second paragraph of the same section: What are the units of measurement here? It is not obvious how to compare variance in gene expression, but the definition of the trait is not even stated in the main text. In the Methods, it emerges that there has been a very obscure normalisation procedure. Are we comparing the variance of absolute values? The variance on a log scale? The variance relative to Ve? This is crucial to how we interpret these variances.

The measurement unit is the intensity of hybridization followed by logarithm transformation to make the distribution more or less normal. This has been the standard practice in obtaining expression levels from microarrays.

Regarding the questions the reviewer has:

1) The variance is not on absolute values; 2) the expression level is on a log scale, the variance is the variance of the expression level; 3) we scale Vm by Ve to obtain mutational heritability hm2. All other comparisons are directly on and can only be on Vmor Vgbecause the two experiments have different Ve.

Subsection “Strong stabilizing selection on quantitative trait variation”: Seeing lower standing variance than expected under neutrality does not imply stabilising selection; there could instead be purifying selection against alleles that have pleiotropic effects on the measured traits. The term "stabilising selection" needs to be defined carefully.

We appreciate this comment. To clarify things, we have now make explicit distinctions between real stabilizing selection and apparent stabilizing selection that may be the result of pleiotropic effects.

In the same section, the comparison of E(am2) vs E(ag2) is confusing. The implicit assumption here is that stabilising selection acts on the observed traits, which is wildly implausible. Rather, Vm/Vg gives an estimate of the net selection on each allele. This selection may be due to stabilising selection on unobserved traits, and/or to direct selection against the alleles. (The distinction here depends of course on what "traits" one defines). It is confusing to use the term "stabilising selection" here.

We now refer to the observed variance difference as apparent stabilizing selection. Nevertheless, we still find the comparison of E(ag2) and E(am2) useful because they provide a connection between the DNA sequences and expression profiles in this study, which has not been possible before. It is perhaps best to consider these allelic effects as the “apparent” or “net” (as the reviewer used) effects, which for pleiotropic effects, are simply the indirect effects of the fitness changing DNA variants on quantitative traits. Regardless of the nature of the stabilizing selection, the outcome is that the effects (whether they are direct or indirect) they have on the traits are on average smaller for standing variation than spontaneous mutations, which we think is an important result. To avoid confusion, we have revised the relevant text.

The basic theory is explained in the Discussion but needs to come much earlier to avoid confusion.

We have revised the text extensively to provide better introduction to the experimental design (e.g. Figure 1) and explanation of results.

Discussion, paragraph four: The estimate of selection from Vm/Vg is an order of magnitude smaller than estimates of selection against deleterious mutations generally. However, this is consistent with a broad distribution of effects on fitness and on traits: the alleles that contribute to genetic variance will tend to be less deleterious than spontaneous mutations. Also, of course, balancing selection may contribute genetic variance.

We agree with the reviewer. These points were in fact made in the text that came a paragraph later, where we offered alternative explanations.

Subsection “Estimation of mutation rate in MA lines”: Does the actual frequency distribution of (presumed) neutral sites actually fit the simulation model? Figure 6 suggests yes, but this is not commented on in the text.

We have added this description to the text.

It seems rather inefficient to just record the number of mutations at p>0.2 – surely using the actual frequency would be more informative? One would also like to see whether non-synonymous mutations are less likely to reach high frequency, as expected if they are recessive deleterious.

We wish we could reliably detect mutations of lower frequency. The sequencing was done on a pool of DNA and was not without error. Choosing a higher frequency cut-off allows us to more accurately identify mutations and estimate mutation rates.

Subsection “Variance in quantitative traits due to mutations, standing variation, and environmental variation”: This formula for VMA makes no sense; it will be negative for large times. There is no explanation of where it comes from.

We are not entirely sure why this formula would lead to negative values for large times. For many numerical examples we tried, k is closest to zero for large Ne and small t, in which the second term in the bracket is a very small number, making k positive. This formula is taken from the reference Mackay et al. 1992, which was based on formula in Lynch and Hill, 1986. These two papers were cited at the end of the sentence introducing this formula.

It is annoying that the figures are all given at the end, but are separated from their captions, and are not numbered.

We agree. This has to do with the journal’s submission system, which asks for separate files for figures. The PDF file is then generated automatically by concatenating figures to the end of the text.

Figures 7 and 9 are especially inscrutable.

We apologize for the crowded text and bars. We have enlarged the figures to better show the bars and texts. We hope the figures are more readable now.

[Editors' note: further revisions were requested prior to acceptance, as described below.]

Reviewer #3:

The revision more or less answers my criticisms, though the changes are minimal. I accept that starting from a single line was a reasonable choice, given the effort involved, and this is now explained better. However, I am still not satisfied on a couple of points:

The revisions have been mostly to better explain the experimental design. However, my original criticism was more that the underlying theory is not set out well at the beginning: this has not been addressed.

A specific problem is with a key interpretation of the results: in paragraph seven of the Discussion it's stated that the estimated selection is too weak to be consistent with classical estimates of selection against alleles that influence viability. I don't think this necessarily follows. If there is a distribution of effects on traits, then the ratio Vm/Vg will be smaller for traits that are less closely connected with fitness. With rare alleles, and assuming mutation/selection balance, the selection estimated from Vm/Vg equals the harmonic mean s, weighted by the squared affect on traits. Thus, of the effect on the trait is not correlated with fitness, we have 1/<1/s>, whereas if the trait is fitness itself, we have <s^2>/<s>, which must be larger. So, the differences in Vm/Vg between viability and gene expression are surely to be expected?

These issues are discussed in Charlesworth's recent PNAS review, which should be not just cited, but used to frame the interpretation of these data.

A point of terminology: I don't think that "apparent stabilising selection" is used correctly. To me, this refers to the reduction in fitness of extreme phenotypes that is observed even for a neutral trait, when trait variation is due to pleiotropic effects of mutations. However, here "apparent SS" refers to the ratio Vm/Vg under this pleiotropic model. I think that "purifying" or "pleiotropic" selection would be a better term to use.

We thank this reviewer for the comments on the interpretation and presentation of data. We respectively disagree with some of these points. Specifically:

1) Regarding how to frame the interpretation of the data, we frame this work as more of an empirical work rather than theoretical. To this end and following the reviewers’ comments in the previous revision, we have provided detailed information on the rationale and expectation of the experimental outcomes. We believe this is of sufficiently clarity. While we are aware of the discrepancies between the data and theories and the fact that there have been presently no consensus theory to reconcile the two, we discuss the implications at the end of the manuscript in length, hoping that the ongoing debate won’t distract the main experimental results.

2) For the difference between Vm/Vg for gene expression and fitness, we agree with the reviewer to the extent that some difference is expected because not all gene expression traits are directly correlated with fitness. There is a distribution of effects and by using the median as the numerical example, we are making the point that at least 50% of the genes are experiencing selection that’s weaker than selection on fitness. We are not entirely sure why this causes an issue.

3) Finally, we believe the term “apparent stabilizing selection” faithfully reflects the nature of the data, i.e., the selection apparently constrains variance but the form is unknown. Purifying or pleiotropic selection are too specific terms to be supported by our data.

4) Charlesworth’s excellent review was specifically for effects of mutations on fitness and its components, not the quantitative traits examined in this manuscript. The point that the strength of selection implicated by the empirical estimate of Vm/Vg is too small by an order of magnitude to be consistent with mutation-apparent (pleiotropic) stabilizing selection has been made before (e.g. Barton 1990) and is generally accepted, so we do not understand the reviewer’s objection to this interpretation.

https://doi.org/10.7554/eLife.14625.023

Article and author information

Author details

  1. Wen Huang

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    3. Department of Biological Sciences, North Carolina State University, Raleigh, United States
    Contribution
    WH, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Contributed equally with
    Richard F Lyman
    Competing interests
    The authors declare that no competing interests exist.
  2. Richard F Lyman

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    3. Department of Biological Sciences, North Carolina State University, Raleigh, United States
    Contribution
    RFL, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Contributed equally with
    Wen Huang
    Competing interests
    The authors declare that no competing interests exist.
  3. Rachel A Lyman

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    3. Department of Biology, Knox College, Galesburg, United States
    Present address
    Department of Biology, Washington University in St. Louis, St. Louis, United States
    Contribution
    RAL, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  4. Mary Anna Carbone

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    3. Department of Biological Sciences, North Carolina State University, Raleigh, United States
    Contribution
    MAC, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  5. Susan T Harbison

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    Present address
    Laboratory of Systems Genetics, National Heart Lung and Blood Institute, Bethesda, United States
    Contribution
    STH, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  6. Michael M Magwire

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    3. Department of Biological Sciences, North Carolina State University, Raleigh, United States
    Present address
    Syngenta, Research Triangle Park, United States
    Contribution
    MMM, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  7. Trudy FC Mackay

    1. Program in Genetics, North Carolina State University, Raleigh, United States
    2. W. M. Keck Center for Behavioral Biology, North Carolina State University, Raleigh, United States
    3. Department of Biological Sciences, North Carolina State University, Raleigh, United States
    Contribution
    TFCM, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    trudy_mackay@ncsu.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon 0000-0002-2312-7245

Funding

National Institute of General Medical Sciences (R01 GM45146)

  • Trudy FC Mackay

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Molly Przeworski, Reviewing Editor, Columbia University, United States

Publication history

  1. Received: January 21, 2016
  2. Accepted: May 21, 2016
  3. Accepted Manuscript published: May 23, 2016 (version 1)
  4. Version of Record published: June 16, 2016 (version 2)
  5. Version of Record updated: October 19, 2016 (version 3)

Copyright

This is an open-access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.

Metrics

  • 1,839
    Page views
  • 486
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Scopus, PubMed Central, Crossref.

Comments

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Cell Biology
    Yi Liu et al.
    Research Article Updated