1. Ecology
  2. Evolutionary Biology

Broad geographic sampling reveals the shared basis and environmental correlates of seasonal adaptation in Drosophila

  1. Heather E Machado  Is a corresponding author
  2. Alan O Bergland  Is a corresponding author
  3. Ryan Taylor
  4. Susanne Tilk
  5. Emily Behrman
  6. Kelly Dyer
  7. Daniel K Fabian
  8. Thomas Flatt
  9. Josefa González
  10. Talia L Karasov
  11. Bernard Kim
  12. Iryna Kozeretska
  13. Brian P Lazzaro
  14. Thomas JS Merritt
  15. John E Pool
  16. Katherine O'Brien
  17. Subhash Rajpurohit
  18. Paula R Roy
  19. Stephen W Schaeffer
  20. Svitlana Serga
  21. Paul Schmidt  Is a corresponding author
  22. Dmitri A Petrov  Is a corresponding author
  1. Department of Biology, Stanford University, United States
  2. Wellcome Sanger Institute, United Kingdom
  3. Department of Biology, University of Virginia, United States
  4. Department of Biology, University of Pennsylvania, United States
  5. Department of Genetics, University of Georgia, United States
  6. Institute of Population Genetics, Vetmeduni Vienna, Austria
  7. Centre for Pathogen Evolution, Department of Zoology, University of Cambridge, United Kingdom
  8. Department of Biology, University of Fribourg, Switzerland
  9. Institute of Evolutionary Biology, CSIC- Universitat Pompeu Fabra, Spain
  10. Department of Biology, University of Utah, United States
  11. Taras Shevchenko National University of Kyiv, Ukraine
  12. National Antarctic Scientific Centre of Ukraine, Taras Shevchenko Blvd., Ukraine
  13. Department of Entomology, Cornell University, United States
  14. Department of Chemistry & Biochemistry, Laurentian University, Canada
  15. Laboratory of Genetics, University of Wisconsin-Madison, United States
  16. Department of Ecology and Evolutionary Biology, University of Kansas, United States
  17. Department of Biology, The Pennsylvania State University, United States
Research Article
Cite this article as: eLife 2021;10:e67577 doi: 10.7554/eLife.67577

Abstract

To advance our understanding of adaptation to temporally varying selection pressures, we identified signatures of seasonal adaptation occurring in parallel among Drosophila melanogaster populations. Specifically, we estimated allele frequencies genome-wide from flies sampled early and late in the growing season from 20 widely dispersed populations. We identified parallel seasonal allele frequency shifts across North America and Europe, demonstrating that seasonal adaptation is a general phenomenon of temperate fly populations. Seasonally fluctuating polymorphisms are enriched in large chromosomal inversions, and we find a broad concordance between seasonal and spatial allele frequency change. The direction of allele frequency change at seasonally variable polymorphisms can be predicted by weather conditions in the weeks prior to sampling, linking the environment and the genomic response to selection. Our results suggest that fluctuating selection is an important evolutionary force affecting patterns of genetic variation in Drosophila.

Introduction

Fluctuations in the environment are an inescapable condition for all organisms. While many of these fluctuations are entirely unpredictable, some are predictable to a degree, including those that occur on diurnal and seasonal time scales. The predictability of cyclic fluctuations is reflected in the fact that many species exhibit plastic physiological and behavioral strategies that enable them to survive the unfavorable season and exploit the favorable one (Denlinger, 2002; Kostál, 2006); such plastic responses represent the classical form of seasonal adaptation (Tauber et al., 1986). However, seasonally varying selection can – in principle – maintain fitness-related genetic variation if some genotypes have high fitness in one season but not another (Gillespie, 1973; Haldane and Jayakar, 1963). Thus, in organisms that undergo multiple generations per year, a distinct form of seasonal adaptation occurs when the frequency of alternate genotypes changes in response to seasonal fluctuations in the environment.

Seasonal adaptation can be seen as a form of local adaptation – local adaptation in time to temporally varying selection pressures rather than in space. Such adaptation in time has often been considered to be uncommon and, when present, unlikely to result in long-term balancing selection (Ewing, 1979; Hedrick, 1976). Classic quantitative genetic theory suggests that an optimal, plastic genotype will eventually dominate a population that is exposed to periodically changing environments (Bürger and Gimelfarb, 1999; Korol et al., 1996; Lande, 2008; Scheiner, 1993). This is particularly so when certain environmental cues are reliable indicators of changes in selection pressure (Levins, 1974; Via and Lande, 1985). Predictions from traditional population genetic models suggest that periodically changing environments will lead to the rapid loss of seasonally favored ecotypes as slight changes in selection pressure from one year to another eventually push allele frequencies at causal alleles to fixation (Hedrick, 1976).

Recent theoretical models have called these classical predictions into question. For instance, a population genetic model by Wittmann et al., 2017 has demonstrated that seasonally varying selection can indeed maintain fitness-related genetic variation at many loci throughout the genome provided that dominance shifts from season to season in such a way that, on average, the seasonally favored allele remains slightly dominant (Curtsinger et al., 1994). This model, along with others that highlight the importance of population cycles (Bertram and Masel, 2019), as well as overlapping generations and age structure (Bertram and Masel, 2019; Ellner, 1996; Ellner and Hairston, 1994; Ellner and Sasaki, 1996), suggests that seasonal adaptation and adaptive tracking (Kain et al., 2015) could be an important feature of organisms such as Drosophila that have multiple generations per year (Behrman et al., 2015, but see Botero et al., 2015). More generally, it is possible that adaptive tracking of environmental fluctuations on rapid time scales is more common than generally acknowledged and has been hidden from us by the difficulty of detecting such tracking reliably (Lynch and Ho, 2020) and by the lack of finely resolved temporal data (Buffalo and Coop, 2020).

Despite the lack of theoretical consensus on whether and how seasonal adaptation operates, there is substantial empirical evidence for seasonal adaptation in many organisms including Drosophila. Seasonal adaptation was first observed in D. pseudoobscura by Dobzhansky and colleagues (e.g., Dobzhansky, 1943) by tracking frequencies of inversion polymorphisms over seasons. Later studies confirmed and extended these early findings to other species including D. melanogaster (Kapun et al., 2016; Stalker, 1980; Stalker, 1976) and D. subobscura (Rodríguez-Trelles et al., 2013).

In D. melanogaster, multiple additional lines of evidence from phenotypic and genetic analyses demonstrate the presence of seasonal adaptation. When reared in a common laboratory environment, flies collected in winter or spring (close descendants of flies that successfully overwintered) show higher stress tolerance (Behrman et al., 2015), greater propensity to enter reproductive dormancy (Schmidt and Conde, 2006), increased innate immune function (Behrman et al., 2018), and modulated cuticular hydrocarbon profiles (Rajpurohit et al., 2017) as compared to flies collected in the fall (the descendants of those flies that prospered during the summer). Rapid adaptation over seasonal time scales in these and related phenotypes has also been observed in laboratory (Schmidt and Conde, 2006) and field-based mesocosm experiments (Erickson et al., 2020; Grainger et al., 2021; Rajpurohit et al., 2018; Rajpurohit et al., 2017; Rudman et al., 2019; Rudman et al., 2021). Genome-wide analysis indicated that a large number of common polymorphisms change in frequency over seasonal time scales in one mid-latitude orchard (Bergland et al., 2014), and allele frequency change among seasons has been observed using candidate gene approaches (Cogni et al., 2014). In several cases, these adaptively oscillating polymorphisms have been linked to seasonally varying phenotypes (Behrman et al., 2018; Erickson et al., 2020; Paaby et al., 2014).

Despite ample evidence of seasonal adaptation in D. melanogaster, many aspects of this system remain unexplored. First, we do not know whether seasonal adaptation is a general feature of D. melanogaster populations across its range. Previous work (Schmidt and Conde, 2006; Bergland et al., 2014; Cogni et al., 2014; Behrman et al., 2015; Behrman et al., 2018) detected seasonal allele frequency fluctuations in a single locality over the span of 1–3 years. These results need to be confirmed at other locations. Second, as many ecological and environmental variables covary with season, it is unclear which specific factors drive the observed changes in phenotype and genotype in populations sampled over seasonal time. One straightforward and statistically powerful way to start answering these questions is to assess the extent of parallel shifts in allele frequencies over seasonal time in many populations across the D. melanogaster range. Should such parallel shifts be detected we can then infer that seasonal adaptation is widespread and at least partly driven by a common set of variants. We can also attempt to associate the magnitude and direction of allelic shifts with local environmental conditions (e.g., temperature) that may vary among populations sampled at equivalent points in seasonal time. Broad sampling will also allow us to determine the magnitude of the shared seasonal fluctuations and give us a glimpse into the genetic architecture of seasonal adaptation: for instance, whether or not the seasonally responsive variants are found genome-wide or clustered into linked blocks or even supergenes.

Here we carry out such a study by using allele frequency data from 20 paired seasonal samples from 15 localities in North America and Europe. First, we demonstrate that seasonal adaptation is a general phenomenon that occurs in multiple populations of D. melanogaster on two continents by providing evidence that at least some of the same polymorphisms cycle between seasons in populations sampled across vast geographic distances. Seasonal alleles tend to show clinal variation, with the alleles that increase in frequency through the summer generally being more frequent at lower latitude (more ‘summer-like’) locations. We also show that the seasonal signal is enriched near inversion breakpoints or within inverted regions on chromosomes 2L, 3L, and 3R, although these inverted regions do not account for all of the signal that we observe. Furthermore, we show that allele frequency change between seasons is most predictable when taking into account the local temperature prior to sample collection in both the spring and the fall, which hints at the complex and nonlinear nature of adaptation, with adaptive tracking occurring at sub-seasonal timescales. Taken together, our work demonstrates that seasonal adaptation is a general and predictable feature of D. melanogaster populations and has pervasive effects on patterns of allele frequencies genome-wide. More generally, we establish that metazoan populations can exhibit adaptive tracking of environmental conditions on extremely short time scales, most likely shorter than a single growing season encompassing ~10 generations.

Results

Fly samples and sequence data

We assembled 73 samples of D. melanogaster collected from 23 localities in North America and Europe (Supplementary file 1, Figure 1—figure supplement 1). For 15 sampling localities, flies were collected in the spring and fall over the course of 1–6 years (Figure 1A). Our sampling and sequencing efforts also complement other genome-wide datasets that investigate genetic variation within and among D. melanogaster populations throughout their range (Campo et al., 2013; Fabian et al., 2012; Grenier et al., 2015; Kao et al., 2015; Kapun et al., 2020; Kim et al., 2014; Kolaczkowski et al., 2011; Lack et al., 2016; Mackay et al., 2012; Pool et al., 2012; Reinhardt et al., 2014; Svetec et al., 2016; Zhao and Begun, 2017).

Figure 1 with 2 supplements
Sampling times, localities, and basic population structure of samples used in this study.

(A) Distribution of collection times during the spring (red) and fall (blue) in relation to latitude. For samples where the collection month, but not day, was recorded, we specified the 15th of the month to calculate Julian Day. (B) Sampling localities for the primary seasonal analysis (‘Core20’: green) and the latitudinal cline analysis (‘Clinal’: red), distributed across North America and Europe. Numbers represent the number of years of spring/fall comparisons at that locality. The dotted circle shows the one locality (PA_li) with both Core20 and Clinal samples. (C) Principal component analysis of SNP allele frequencies. Circles are placed around samples from three geographic regions: Europe (EU), California (CA), and Eastern North America (East NA); points are colored by latitude and shapes represent seasons, as defined by collection time.

For our analysis, we divided our samples into two subsets (Figure 1B). The first subset (‘Core20’) is composed of 20 pairs of spring and fall samples from the same location. Samples in the Core20 set are drawn from 15 localities in North America and Europe and we use at most 2 years of sampling from any one locality. The second subset of samples (‘Clinal’) was used to examine patterns of clinal variation along the East Coast of North America and consists of four populations sampled in the spring (see Materials and methods and Supplementary file 1).

To estimate allele frequencies genome-wide, we performed pooled DNA sequencing of multiple individual flies per population (Schlötterer et al., 2014; Zhu et al., 2012). For each sample, we resequenced pools of ~75 individuals (range 27–164) to an average of 94× coverage (range 22–220, Supplementary file 1). Analyses presented here use a total of 43 samples in the Core20 and Clinal sets described above. Data from the remaining 30 samples, including those that do not constitute paired spring/fall samples, are not included in our analyses here but nonetheless form a part of our larger community-based sampling and resequencing effort and are deposited with the rest of the data (NCBI SRA; BioProject Accession #PRJNA308584; accession numbers for each sample can be found in Supplementary file 1).

Pooled resequencing identified ~1.75M SNPs following quality control and filtering on read depth and average minor allele frequency (>0.01). In a second filtering step, we retained ~775,000 SNPs that are polymorphic in every population sampled. This biases us toward assessing fluctuations at higher frequency polymorphisms (Figure 1—figure supplement 2).

To gain insight into basic patterns of differentiation among the sampled populations, we performed a principal component (PC) analysis across all samples. Samples cluster by locality and geography (Figure 1C) with PC one separating European, North American West Coast, and East Coast populations, while PC two separates the eastern North America samples by latitude (linear regression p=3*10−15, R2 = 0.84). No PC is associated with season, following correction for multiple testing, implying that the spatial differentiation dominates patterns of SNP variation across these samples.

Seasonal adaptation is a general feature of D. melanogaster populations

Our first objective was to determine whether or not we find evidence of parallel seasonal cycling of allele frequencies across multiple populations. To identify a signal of parallel allele frequency change over time, we assessed the per-SNP change in allele frequency between spring and fall among the Core20 set of populations using three complementary methods that rely on different statistical models. For all three methods, we contrast the observed signal to permutations in which we shuffled the seasonal labels within each spring–fall paired sample. Because only the seasonal labels within paired samples are shuffled, these permutations retain all other features of the data, including patterns of genetic covariation and the noise in allele frequency estimation.

We first used a generalized linear model (GLM), regressing allele frequency at each SNP on season across all of the populations. We find that the observed genome-wide distribution of ‘seasonal’ p-values from this model is not uniform and there is a pronounced excess of SNPs with low p-values (Figure 2A). The observed data have a stronger seasonal signal than most permutations. For example, at a p-value of ~4*10−3, which corresponds to the top 1% (Supplementary file 1B), the observed data have a greater number of seasonal SNPs than 96.8% of the permutations (Figure 2B). This enrichment is robust to the choice of quantile (Figure 2—figure supplement 1). Seasonal SNPs identified by Bergland et al., 2014 are slightly enriched among the top 1% of SNPs identified using the GLM, relative to matched genomic controls, after excluding the overlapping Pennsylvanian populations (log2 odds ratio ± SD = 0.59 ± 0.37, pperm = 0.0512). Our top set of seasonally variable SNPs shift in frequency by ~4–8% (Figure 2C). Taken at face value, this corresponds to effective selection coefficients of ~10–30% per season (~1–3% per generation assuming 10 generations per season).
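The permutation scheme described above can be sketched in a few lines. This is a minimal illustration rather than the authors' pipeline: a paired t-test stands in for the per-SNP binomial GLM of frequency on season, and the function names and simulated inputs are hypothetical.

```python
import numpy as np
from scipy import stats

def seasonal_pvalues(spring, fall):
    """Per-SNP paired test of spring vs. fall allele frequencies.

    spring, fall: (n_snps, n_pairs) frequency arrays. A paired t-test is
    used here as a simple stand-in for the paper's GLM of frequency on season.
    """
    return stats.ttest_rel(spring, fall, axis=1).pvalue

def permutation_enrichment(spring, fall, quantile=0.01, n_perm=200, seed=1):
    """Contrast the observed count of low-p SNPs with label-shuffled permutations.

    Season labels are flipped independently within each spring/fall pair, so
    every other feature of the data (covariance among SNPs, sampling noise)
    is preserved. Returns the observed number of SNPs at or below the
    `quantile` p-value cutoff and the fraction of permutations whose count
    the observed data exceed.
    """
    rng = np.random.default_rng(seed)
    p_obs = seasonal_pvalues(spring, fall)
    cutoff = np.quantile(p_obs, quantile)
    obs_count = (p_obs <= cutoff).sum()
    perm_counts = np.empty(n_perm)
    for i in range(n_perm):
        flip = rng.random(spring.shape[1]) < 0.5   # which pairs get swapped labels
        s = np.where(flip, fall, spring)
        f = np.where(flip, spring, fall)
        perm_counts[i] = (seasonal_pvalues(s, f) <= cutoff).sum()
    return obs_count, (obs_count > perm_counts).mean()
```

On simulated data in which a subset of SNPs shifts consistently between seasons, the observed count of low-p SNPs exceeds that of nearly all permutations, mirroring the enrichment test used here.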

Figure 2 with 4 supplements
Signals of seasonal adaptation.

(A) p-value distribution of GLM seasonal regression (red line), permutations (solid black lines, 50/500 plotted), and expected values (dashed black line). (B) Comparison of seasonal test statistics (quantile = 0.01) for observed (red) and permuted (black) datasets, for the three analyses: i: GLM (−log p-value), ii: Bayenv (Bayes factor; outlier of a Bayes factor of 10 excluded from plotting), iii: RFM (X2). (C) Average spring/fall allele frequency change for each of the top 1% of seasonally varying SNPs (red) and non-seasonal, matched control SNPs (black), as a function of the folded spring allele frequency. Lines represent a moving average across SNPs with a given spring allele frequency. (D) Enrichment of seasonal SNPs (median of 500 permutations). Enrichment is calculated as the observed number of seasonal SNPs over the number of seasonal SNPs in each permutation at p<0.004. The genome is subset by chromosome and location relative to major cosmopolitan inversions (bkpts.: inversion breakpoints ± 1 Mb, inside: interior to the breakpoints excluding 1 Mb buffer, outside: exterior to the breakpoints excluding the 1 Mb buffers). Error bars are 95% confidence intervals and asterisks denote greater seasonal enrichment than more than 95% of permutations. (E, F) Concordance rates of seasonal and clinal polymorphisms by (E) geographic region and (F) genomic region. The y-axis represents the concordance rate of allele frequency change across time and space, assuming that the winter favored allele is the same as the allele favored in high latitudes. Clinal polymorphisms were identified along the East Coast of North America. Seasonal sites were identified using all populations not used in the clinal analysis (n = 18). (E) Seasonal sites were also identified using exclusively the California populations (purple: n = 3) or the Europe populations (green: n = 3).
Shaded areas are the 95% confidence intervals of 100 matched control datasets for the Californian (purple) and European (green) analyses. (F) Concordance by genomic region. Error bars are 95% confidence intervals (binomial error) and asterisks denote concordance significantly greater than 0.5 (binomial test p<0.05).

Next, we performed a parallel analysis using a Bayesian model, Bayenv, which uses a population covariance matrix to account for structure among samples (Günther and Coop, 2013). We found an enrichment of seasonal SNPs at high test statistic values (quantile = 0.01: Bayes factor = 2.6), with a greater enrichment of seasonal SNPs compared with 94% of permutations (Figure 2B). The set of seasonal SNPs identified in the Bayenv analysis shows substantial overlap with the GLM set. Specifically, for the top 1% of SNPs in the GLM analysis, 20% are also in the top 1% of the Bayenv analysis, compared with a naive expectation of 1% (Figure 2—figure supplement 2).

Finally, we developed a rank-based method (RFM) that combines rank-normalized p-values from Fisher’s exact tests comparing allele frequencies within each pair of seasonal samples (see Materials and methods, Figure 2—figure supplement 3). We again find a deviation from the expected test statistic distribution, indicating an excess of seasonally varying polymorphisms (Figure 2B), with the observed enrichment of high scores (quantile = 0.01: X2 = 32) greater than more than 98% of permutations. These three analyses, using entirely distinct statistical machinery and assumptions, together provide strong evidence of a genome-wide excess of seasonal SNPs, indicating a robust signal of parallel seasonal adaptation across multiple populations.
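A rough sketch of how such a rank-combination statistic can be computed follows. It assumes simple 2×2 tables of alternate/reference read counts per pair and uses Fisher's method for the combination step, which may differ in detail from the authors' implementation; all names are illustrative.

```python
import numpy as np
from scipy import stats

def rank_uniform(p):
    """Rank-transform a vector of p-values onto (0, 1)."""
    return stats.rankdata(p) / (len(p) + 1)

def rfm_statistic(spring_counts, fall_counts, depth):
    """Rank-based combination of per-pair Fisher's exact tests.

    spring_counts, fall_counts: (n_snps, n_pairs) alternate-allele read
    counts; depth: total read depth (same shape). Per-pair Fisher's exact
    p-values are rank-normalized within each pair, then combined across
    pairs with Fisher's method into a chi-squared statistic (df = 2*n_pairs).
    """
    n_snps, n_pairs = spring_counts.shape
    u = np.empty((n_snps, n_pairs))
    for j in range(n_pairs):
        p = np.array([
            stats.fisher_exact(
                [[spring_counts[i, j], depth[i, j] - spring_counts[i, j]],
                 [fall_counts[i, j],  depth[i, j] - fall_counts[i, j]]]
            )[1]
            for i in range(n_snps)
        ])
        u[:, j] = rank_uniform(p)                # uniform within each pair
    x2 = -2.0 * np.log(u).sum(axis=1)            # Fisher's method
    pcomb = stats.chi2.sf(x2, df=2 * n_pairs)
    return x2, pcomb
```

On a toy dataset where one SNP shifts consistently from 30% to 70% across four paired samples while the rest drift around 50%, that SNP receives the largest combined statistic by construction.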

We examined whether seasonally variable SNPs are enriched on any particular chromosome. To perform this test, we calculated the enrichment of seasonal SNPs per chromosome in the observed data relative to the permutations (Figure 2D). Chromosomes 2L and 3R have the greatest nominal enrichment of seasonal SNPs relative to permutations, with a mean increase of 30% (pperm = 0.06) and of 18% (pperm = 0.08), respectively. Enrichment is much lower for chromosomes 2R and 3L (−2% and 4%, pperm = 0.54 and pperm = 0.20, respectively). The variation of the seasonal signal across chromosomes is statistically significant (ANOVA, p<10−15).

As chromosomal inversions have been observed to fluctuate in frequency over seasonal time (Dobzhansky, 1948; Dobzhansky, 1943; Kapun et al., 2016), we extended the analysis to focus on large cosmopolitan inversions. We find that regions near inversion breakpoints (±0.5 Mb) on chromosomes 2L (85% enrichment, pperm = 0.03, inversion In(2L)t) and 3L (30% enrichment, pperm = 0.01, inversion In(3L)Payne) are strongly enriched for seasonal SNPs. Regions within the three inversions on chromosome 3R (In(3R)Mo, In(3R)K, In(3R)Payne) are also enriched for seasonal SNPs (26% enrichment, pperm = 0.05). We do not find evidence for enrichment of seasonal SNPs associated with In(2R)NS as previously reported by Kapun et al., 2016. The regions outside inversions (also excluding the 0.5 Mb flanking the inversions) represent the most centromeric/telomeric SNPs (36% of total SNPs). In these regions, the seasonal signal is weaker, with a 2.5% enrichment of seasonal SNPs (pperm = 0.31). These data support the idea that inversions play a role in the adaptive response to seasonally variable environments, similar to that of structural variants or large haplotype blocks in other systems (Hager et al., 2021; Jones et al., 2012; Samuk et al., 2017; Tuttle et al., 2016).

Latitudinal differentiation parallels signatures of seasonal adaptation

Parallel changes in phenotype and genotype across time and space have been observed in D. melanogaster (Bergland et al., 2014; Cogni et al., 2015; Cogni et al., 2014; Kapun et al., 2016; Kapun and Flatt, 2019; Paaby et al., 2014; Rajpurohit et al., 2018; Rajpurohit et al., 2017). To assess whether we observe similar patterns in our data, we tested whether SNPs that change in allele frequency along a latitudinal cline and across seasons do so in a parallel fashion. Parallel changes in allele frequency occur when the sign of the allele frequency change between spring and fall is the same as between high- and low-latitude populations. Using four populations along the East Coast latitudinal cline, we identified clinally varying polymorphisms using single-locus models (GLM, with allele frequency regressed against latitude) as well as a model that accounts for population structure (Bayenv; Günther and Coop, 2013). We re-identified seasonally varying SNPs using a GLM (as done previously) using 18 of the Core20 populations to avoid two samples drawn from a shared collection locale with the clinal set (PA_li; Supplementary file 1).

We find significant seasonal and latitudinal parallelism, with the magnitude of parallelism increasing with increasing GLM significance thresholds (Figure 2E). Among the polymorphisms that are in the top 1% of both seasonal and clinal GLMs (n = 115), 74% have parallel allele frequency change (greater than 100/100 matched controls; naive expectation: 50%). This pattern of parallelism is also observed when using a Bayenv model of clinality (Figure 2—figure supplement 4). The parallel changes in allele frequency between seasons and clines that we observe here are similar to previously published genome-wide (Bergland et al., 2014) and locus-specific results (Cogni et al., 2014; Kapun et al., 2016; Stalker, 1980; Stalker, 1976, p. 197).
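The concordance calculation itself reduces to sign-matching. A hedged sketch, with hypothetical variable names (not the authors' code), assuming per-SNP frequency differences have already been computed:

```python
import numpy as np
from scipy import stats

def concordance_rate(seasonal_delta, clinal_delta):
    """Fraction of SNPs where the spring-to-fall frequency change has the
    same sign as the high- to low-latitude change, i.e., the allele that
    rises over summer is the allele more common at lower, more 'summer-like'
    latitudes. Returns the concordance and a binomial p-value against the
    null expectation of 0.5.
    """
    keep = (seasonal_delta != 0) & (clinal_delta != 0)   # drop ties
    same = np.sign(seasonal_delta[keep]) == np.sign(clinal_delta[keep])
    k, n = int(same.sum()), int(keep.sum())
    return k / n, stats.binomtest(k, n, 0.5).pvalue
```

For example, 74 concordant SNPs out of 100 yields a concordance of 0.74 with a binomial p-value well below 0.05, matching the scale of the enrichment reported above.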

To evaluate whether the parallel allele frequency changes across time and space are likely to be independent of shared demographic processes such as local seasonal migration, we compared East Coast clinal variation to seasonal variation identified in two groups of populations that are geographically and genetically distant from the East Coast (Campo et al., 2013 and see Figure 1B) and that are unlikely to be connected via common seasonal migration. We assessed seasonal changes in allele frequency separately for Californian (n = 3) and European (n = 3) populations and tested for parallelism with polymorphisms that vary along the East Coast of North America. We found a clear signal of parallelism between seasonal and latitudinal variation in both the California and the Europe comparisons: for the top 1% of seasonal and clinal polymorphisms, 70% and 63% have parallel allele frequency change for California and Europe, respectively (greater than 100/100 and 99/100 matched controls; Figure 2E).

As SNPs in or near a number of major inversions are enriched for seasonal variation (Figure 2D), we tested whether there is increased seasonal and latitudinal parallelism in these regions. For the top 5% of clinal and seasonal SNPs (n = 2132), parallel allele frequency change is highest near inversion breakpoints on chromosomes 2L and 3R (92%, CI95: 87–96%, and 86%, CI95: 78–93%, respectively) and inside the 2L inversion (83%, CI95: 80–87%; Figure 2F). While parallelism is greatest overall around inversion breakpoints (concordance: 81%, CI95: 77–84%) and inside inversions (concordance: 70%, CI95: 67–73%), we also observe a statistically significant excess of latitudinal and seasonal parallelism outside of inversions (concordance: 57%, CI95: 53–62%). The elevated signal of parallelism between spatial and seasonal allele frequency fluctuations across the whole genome, and in particular outside of inverted regions, suggests that inversions themselves, while important, are not the sole drivers of allele frequency change between seasons.

Predictability of seasonal adaptation based on local environmental conditions

The signature of seasonal adaptation that we observe indicates that there are consistent changes in allele frequencies between seasons, broadly defined, among populations sampled across multiple years and localities separated by thousands of miles (Figure 1A,B). Next, we asked whether seasonal fluctuations in allele frequencies are predictable within any particular population or year and whether the extent of predictability is related to aspects of local climate or other factors.

To address this question, we performed a leave-one-out analysis. In this analysis, we sequentially removed each of the 20 paired spring–fall populations within the Core20 set and re-identified seasonally variable SNPs among the remaining 19 populations as well as for the dropped 20th population. To evaluate predictability, we calculated the proportion of SNPs that have the same direction of allele frequency change between spring and fall in both the discovery and test models (i.e., concordant allele frequency change). We estimated concordance across the range of joint significance thresholds of both models. For each population, we then calculated a ‘predictability score’. This score reflects the change in concordance rate as a function of the joint significance threshold (see Materials and methods for details). Note that populations with a positive predictability score are those in which the concordance rate is greater than 50% over the bulk of the genome or chromosome.

We find that the majority of populations show a positive predictability score (Figure 3A). For these populations, the concordance rate reaches 60–70% among SNPs passing a stringent joint significance threshold (top 0.1% in both models). At more modest joint significance thresholds (e.g., the top 10% in both sets), the concordance rate is lower but still statistically different from 50% (maximum nominal p-value at the top 5% of sites is 0.008). Intriguingly, however, four of the 20 populations had negative predictability scores (Benton Harbor, Michigan 2014; Lancaster, Massachusetts 2012; Topeka, Kansas 2014; and Esparto, California 2012). Strong allele frequency changes do occur in these four populations, but the direction of allele frequency change from spring to fall at many SNPs is opposite that of the remaining populations.
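The leave-one-out score can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the score is taken as the slope of concordance on −log10 of the joint quantile threshold, and all argument names are hypothetical.

```python
import numpy as np

def predictability_score(p_single, p_rest, delta_single, delta_rest,
                         quantiles=(0.5, 0.2, 0.1, 0.05, 0.01)):
    """Leave-one-out predictability score (illustrative sketch).

    p_single, delta_single: per-SNP p-values and spring-to-fall frequency
    change in the held-out population; p_rest, delta_rest: the same from the
    model fit to the remaining populations. At each joint quantile threshold
    we compute the fraction of SNPs whose frequency change has the same sign
    in both sets; the score is the slope of concordance on -log10(quantile),
    so a positive score means concordance rises with increasing significance.
    """
    conc = []
    for q in quantiles:
        sel = ((p_single <= np.quantile(p_single, q)) &
               (p_rest <= np.quantile(p_rest, q)))
        if not sel.any():
            conc.append(np.nan)
            continue
        conc.append(np.mean(np.sign(delta_single[sel]) ==
                            np.sign(delta_rest[sel])))
    x = -np.log10(np.asarray(quantiles))
    y = np.asarray(conc, dtype=float)
    ok = ~np.isnan(y)
    slope = np.polyfit(x[ok], y[ok], 1)[0]
    return slope, dict(zip(quantiles, conc))
```

With a shared set of truly seasonal SNPs driving both the held-out and discovery tests, concordance climbs toward 100% at stringent thresholds and the score is positive; a population whose top SNPs move in the opposite direction would yield a negative score.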

Figure 3. The thermal limit model.

The thermal limit model argues that weather in the weeks prior to sampling can explain variation in the predictability of allele frequency change. (A) Each line represents the comparison between one population and the remaining 19. The x-axis represents the upper threshold of quantile-ranked p-values from the single-population test (Fisher’s exact test) and the 19-population test (GLM), e.g., the top 1% in both tests. The y-axis is the fraction of SNPs where the sign of allele frequency change in the single population test matches the average sign change among the remaining 19. The color scheme represents the slope of this line and is used as a summary statistic for each population. (B, C) We regressed the summary score from (A) onto a number of characterizations of average temperature (B, first four rows), geography (B, fifth row), technical factors (B, sixth to eighth rows), and thermal extremes (C), considering weather in the 14 days prior to sampling. Diamonds represent the observed R2 for (B) and the observed maximum R2 across all thermal limits for (C). Violin plots represent the expected distribution of R2 based on permutations. Red diamonds represent models with nominal p-value<0.01; the empirical p-values for these models are listed next to the corresponding red diamond. The 14 day model that uses the counts of hot spring days and cold fall days has a false discovery rate of 17% based on multiple testing correction across all environmental models. (D) The distribution of spring maximum (S-Max) daily temperature and fall minimum (F-min) daily temperature in the 2 weeks prior to sampling. Discordant (blue) populations do not cluster in time or space. Populations shown here are those for which we have weather data (15 populations in total). (E) The stand-out model uses the number of hot spring days and cold fall days.
To determine the optimal threshold for what defines hot and cold, we systematically varied the upper and lower thermal limits from 0°C to 40°C and used the count of hot spring (x1) and cold fall (x2) days as independent, additive variables in a regression model; the genome-wide predictability score was used as the dependent variable (y). The best-fit model uses a spring max (S-Max) temp of ~32°C and a fall min (F-Min) temp of ~5°C and explains ~82% of the variation in the population predictability scores.

We then asked why predictability scores vary among populations. To address this question, we built models relating the genome-wide predictability score (Figure 3A) of each population to a number of potential explanatory variables (Figure 3B,C,D). We first considered several nuisance variables, such as collection method and substrate, as well as contamination rates by D. simulans. None of these variables is correlated with the genome-wide predictability score more than expected by chance (Figure 3B). Latitude and longitude are also not strong predictors of variation in the genome-wide predictability score (Figure 3B).

We next focused on an environmental model based on aspects of the thermal environment. The motivation for these characterizations of temperature is the hypothesis that populations with negative genome-wide predictability scores were exposed to particularly warm springs and/or particularly cool falls prior to our sampling. This hypothesis is plausible considering that short-term changes in temperature have been linked, either directly or indirectly, to dramatic and partly predictable changes in allele frequencies in the wild and in the laboratory in multiple Drosophilid species (Barghi et al., 2019; Bergland et al., 2014; Mallard et al., 2018; Rodríguez-Trelles et al., 2013; Tobler et al., 2014). We evaluated a number of features of temperature by calculating daily averages, average minima and maxima, as well as cumulative degree days (Figure 3B). Finally, we calculated the number of days below and above thermal limits, with the thermal limits defined across a range of possible values (Figure 3C).

We find that the only set of population features significantly correlated with the genome-wide predictability score is the number of hot spring days and the number of cold fall days prior to sampling. The best-fit thermal limit model explains 82.2% of the variation in genome-wide predictability scores (Figure 3C and E) and outperforms >99.5% of permutations (false discovery rate = 17%) (Figure 3C). Specifically, the thermal limits identified by this best-fit model are the number of days in the 2 weeks prior to spring sampling with maximum temperature above ~32°C and the number of days in the 2 weeks prior to fall sampling with minimum temperature below ~5°C. These values roughly correspond to the upper and lower thermal limits for flies (Hoffmann, 2010; Kellermann et al., 2012; Overgaard and MacMillan, 2017). Taken together, these results suggest that exposure to temperature extremes, or to associated environmental variables, acts as a selective pressure in the wild, causing large fluctuations in allele frequency throughout the genome.

Discussion

In this study, we have focused on the identification of genomic signatures of seasonal adaptation in D. melanogaster. Our approach relies on the detection of parallel changes in allele frequency from spring to fall across 20 populations. We use coarse definitions of spring and fall, with spring as the time of year close to when D. melanogaster first becomes locally abundant and fall as the time close to the end of the growing season but before populations become too scarce to sample (Behrman et al., 2015; Ives, 1970). We sampled populations across a wide geographic range, such that the length of the growing season as well as other environmental conditions varied substantially. Despite this environmental heterogeneity and the coarseness of the definitions of spring and fall, we detect clear signals of parallel seasonal allele frequency shifts across these populations (Figure 2A,B). Specifically, we demonstrate that seasonal adaptation is a general phenomenon of temperate D. melanogaster populations, and not restricted to a single orchard (Linvilla Orchard, Media, Pennsylvania: Behrman et al., 2015; Bergland et al., 2014; Cogni et al., 2014; Schmidt and Conde, 2006), biogeographic region (East Coast of North America: Behrman et al., 2018), or even a continent.

By detecting consistent seasonal changes in allele frequencies among multiple populations in North America and Europe, we can argue that seasonal adaptation across this whole range is at least partly driven by a consistent set of loci. Although parallel responses to selection in polygenic models of adaptation may be the exception rather than the norm due to redundancy (Barghi et al., 2019), we lack power to detect alleles that cycle in a small subset of populations or are private to a single population. We therefore focus only on the polymorphisms that segregate in all populations – and which consequently have a higher population frequency – and only look at the signal of parallel cycling. Thus, our conclusions are limited to the shared genetic basis of seasonal adaptation as revealed by the cycling of ~750,000 common SNPs which represent roughly half of all detected SNPs.

The detected seasonal SNPs can either be causal or, vastly more likely, linked to nearby causal genetic variants, be they SNPs, indels, structural variants, or transposable element insertions. We detected a stronger seasonal signal near or inside inversions on chromosomes 2L, 3L, and 3R, suggesting that inversions are important drivers of seasonal adaptation (Dobzhansky, 1948; Dobzhansky, 1943; Kapun et al., 2016; Kapun and Flatt, 2019; Rodríguez-Trelles et al., 2013). Furthermore, the power to detect parallel seasonal evolution should be stronger in regions with reduced recombination, plausibly increasing the power of our approach to detect parallel adaptation linked to structural variants. Note that, by focusing on the subset of seasonal sites that also show clinal patterns of geographic differentiation, we were able to detect seasonal signal outside inversions as well. The cosmopolitan inversions span a substantial portion of the genome, and there is also likely to be substantial genetic diversity within these cytologically defined elements, further complicating the inference. Thus, our study suggests that fine-mapping the causal variants underlying the complicated architecture of seasonal adaptation will require much broader seasonal sampling of populations, but that such an endeavor is not futile: even with all the limitations of our approach, it is possible to detect a set of loci that show consistent seasonal fluctuations. It is also clearly important in the future to obtain genomic data capable of identifying structural variation, including inversions, that plausibly underlies much of seasonal adaptation in Drosophila.

One key and poorly understood aspect of Drosophila seasonal adaptation is how populations survive the annual winter collapse and then expand to recolonize available resources during the summer (Ives, 1970; Shpak et al., 2010). Also poorly understood is the extent to which these cycles of collapse, overwintering, and repopulation involve migration and refugia (Coyne and Milstead, 1987; Ives, 1970). However, it is clear that seasonal adaptation in D. melanogaster cannot involve wholesale population replacement by migration, because the geographic differentiation of D. melanogaster populations is robust year after year, as evidenced by strong spatial population structure (Figure 1C) and persistent clinal patterns (Machado et al., 2016; van Heerwaarden and Hoffmann, 2007). While one can conceive of migration from local refugia playing a role, any such scenario still requires natural selection to reset populations back to the ‘spring’ state every year. The full metapopulation dynamics, and the ways that local D. melanogaster populations survive seasonal crashes and expansions, remain to be understood.

Our ability to detect a shared genetic basis of seasonal adaptation across these broadly distributed populations implies that there are generic features in the environment that exert consistent selective pressures despite the environmental heterogeneity across sampling times and locations. It is possible that such generic selective pressures relate to temperature, humidity, population density, resource availability, or other biotic and abiotic factors (Bogaerts-Márquez et al., 2021). The seasonal and latitudinal parallelism in allele frequency change that we observe (Figure 2E) suggests that the environmental factors that drive these patterns must both shift between seasons locally and vary spatially from south to north.

While the exact nature of the environmental variables that drive seasonal selection remains to be determined, here we provide evidence that temperature extremes in the weeks prior to sampling are a strong proximate or even ultimate selective agent (Figure 3). Specifically, we show that the seasonal behavior of any one population could be statistically associated with temperature extremes in the 2 weeks prior to sampling, both in the spring and in the fall. This model accounts for why several populations showed a reversal in the direction of movement of seasonal SNPs relative to the bulk of the populations: these populations had unusually warm springs and unusually cool falls prior to sampling. We thus hypothesize that, at the time of sampling, some of our spring samples had already experienced sufficiently strong selection in the generically ‘summer’ direction and that the corresponding fall samples had already experienced selective pressures in the generically ‘winter’ direction. This would result in seasonal alleles in the spring already showing fall-like patterns and those in the fall showing spring-like patterns.

These findings also raise an intriguing possibility that natural selection may not push populations in the same consistent direction across the entire growing season but rather that natural selection and adaptation may be more heterogeneous through time (Grant, 2002; Nosil et al., 2020; Siepielski et al., 2009). In the future, more dense temporal sampling might help map the action of natural selection to more specific environmental factors and to quantify how fast the selective pressures change in strength and even direction across a single growing season. Our results thus illustrate that evolution and ecology do operate on similar timescales in this system (Hendry, 2020; Schoener, 2011; Thompson, 1998; Yoshida et al., 2003) and that evolutionary adaptation cannot be generally ignored a priori in demographic or ecological investigations even on extremely short timescales (Rudman et al., 2018).

While the number of detected seasonal loci is likely to be a vast overestimate of the number of causal seasonal variants, particularly if there is selection on large inversions, the strength of selection implied by the magnitude of fluctuations at seasonal sites is likely to be an underestimate. Partially linked loci should show a smaller magnitude of fluctuation than the true causal sites. Thus the 4–8% seasonal frequency shifts at the top 1% of the seasonal sites and the corresponding effective selection coefficients of ~10–30% per season (~1–3% per generation) imply even larger magnitudes of selection at the causal sites. Furthermore, given that we can only detect patterns of seasonal cycling that are at least partially consistent across all or most sampled populations, we are likely to underestimate the impact of linked seasonal selection due to selection that is occurring in a population-specific manner. While given our dataset it is impossible to know the true extent of polygenicity, our results suggest that seasonal adaptation acts on at least several but potentially a large number of dispersed loci affecting linked variation genome-wide. This seasonal selection force could rival in strength genetic draft due to selective sweeps and should be much stronger over seasonal timescales than random genetic drift in populations as large as D. melanogaster (Karasov et al., 2010).

Our observation of signals of seasonal adaptation at common SNPs demonstrates that polymorphism is maintained across the range of D. melanogaster despite strong fluctuating selection. The most straightforward explanation of these results is that these polymorphisms are subject to balancing selection (Bertram and Masel, 2019; Charlesworth, 2015; Gulisija and Kim, 2015; Levene, 1953; Wittmann et al., 2017). Pervasive balancing selection is consistent with the recent realization that simple models of mutation-selection balance are inconsistent with the extent of quantitative genetic variation in a variety of fitness-related traits (Charlesworth, 2015). A full understanding of the genomic consequences of balancing selection caused by strong fluctuating selection will require the identification of individual causal loci. Doing so will require substantial additional sampling across seasons as well as from closely and broadly distributed locales throughout the whole geographic range of D. melanogaster, in combination with the genetic mapping of seasonally fluctuating phenotypes (Behrman et al., 2018; Behrman et al., 2015; Erickson et al., 2020; Rajpurohit et al., 2018; Rajpurohit et al., 2017; Schmidt and Conde, 2006). Finally, such efforts can be strongly augmented by manipulative field experiments and functional work to investigate the molecular effects of putatively causal genetic variants. All of this work will need to be done in the context of other processes that underlie the persistence of populations in the face of environmental heterogeneity, such as phenotypic plasticity and bet hedging (Ayrinhac et al., 2004; Bergland et al., 2008; Kain et al., 2015; MacMillan et al., 2015; Overgaard et al., 2011; Via and Lande, 1985).
Overall, it is becoming clear that all of the key ecological and evolutionary processes, including demographic changes, mutation, migration, phenotypic plasticity, and rapid evolution by natural selection, can and do occur on very short and overlapping timescales and thus must be investigated as part of one complex evolving and interacting system.

Materials and methods

Population sampling and sequence data

We assembled 73 samples of D. melanogaster, 61 representing newly collected and sequenced samples, and 12 representing previously published samples (Bergland et al., 2014; Kapun et al., 2016). Locations, collection dates, number of individuals sampled, and depth of sequencing for all samples are listed in Supplementary file 1. For each sample, members of the Drosophila Real-Time Evolution Consortium (DrosRTEC) collected an average of 75 male flies using direct aspiration from substrate, netting, or trapping at orchards and residential areas. Flies were confirmed to be D. melanogaster by examination of the male genital arch. We extracted DNA by first pooling all individuals from a sample, grinding the tissue together in an extraction buffer, and using a lithium chloride – potassium acetate extraction protocol (see Bergland et al., 2014 for details on buffers and solutions). We prepared sequencing libraries using a modified Illumina protocol (Bergland et al., 2014) and Illumina TruSeq adapters. Paired-end 125 bp libraries were sequenced to an average of 94× coverage either at the Stanford Sequencing Service Center on an Illumina HiSeq 2000 or at the Stanford Functional Genomics facility on an Illumina HiSeq 4000.

The following sequence data processing was performed on both the new and the previously published data. We trimmed low-quality 3’ and 5’ read ends (sequence quality < 20) using the program cutadapt v1.8.1 (Martin, 2011). We mapped the raw reads to the D. melanogaster genome v5.5 (and the D. simulans genome v2.01; flybase.org) using the bwa v0.7.12 mem algorithm with default parameters (Li and Durbin, 2009), and used the program SAMtools v1.2 for bam file manipulation (functions index, sort, and mpileup) (Li et al., 2009). We used the program picard v2.0.1 to remove PCR duplicates (http://picard.sourceforge.net) and the program GATK v3.2–2 for indel realignment (McKenna et al., 2010). We called SNPs and indels using the program VarScan v2.3.8 using a p-value of 0.05, minimum variant frequency of 0.005, minimum average quality of 20, and minimum coverage of 10 (Koboldt et al., 2012). We filtered out SNPs within 10 bp of an indel (as they are more likely to be spurious), variants in repetitive regions (identified by RepeatMasker and downloaded from the UCSC Genome Browser), SNPs with a median frequency of less than 1% across populations, regions with low recombination rates (~0 cM/Mb; Comeron et al., 2012), and nucleotides with more than two alleles. Because we sequenced only male individuals, the X chromosome had lower coverage and was not used in our analysis. After filtering, we had a total of 1,763,522 autosomal SNPs. This set was further filtered to include only SNPs found polymorphic in all samples (‘common polymorphisms’), resulting in 774,651 SNPs that represent the core set used in our analyses.

Due to the phenotypic similarity of the species D. melanogaster and D. simulans, we tested for D. simulans contamination by competitively mapping reads to both genomes. We omitted any of our pooled samples with greater than 5% of reads mapping preferentially to D. simulans (Charlottesville, VA 2012, fall; Media, PA 2013 fall). For the remaining samples, reads that mapped better to the D. simulans genome were removed from the analysis. For the populations used in this analysis, the average proportion of reads mapping preferentially to D. simulans was less than 1% (see Supplementary file 1), and there was no significant difference in the level of D. simulans contamination between spring and fall samples (t-test p=0.90).

In silico simulation analysis shows that competitive mapping accurately estimates the fraction of D. simulans contamination. To conduct these simulations, we used a single D. simulans genome from an inbred line derived from an individual caught in California (Signor, 2017) and a single D. melanogaster genome from a DGRP line (Mackay et al., 2012). We used wgsim to generate simulated short reads from each genome and mixed these short reads together in various proportions. We then mapped the short reads back to the composite genome for competitive mapping, as described above. We calculated the contamination rate as the number of total reads mapping to the D. simulans reference genome divided by the number of reads mapping to both D. simulans and D. melanogaster. These simulations demonstrate that the estimate of cross-species mapping is precise (Pearson’s r = 0.9999, p=1.6×10−24) but underestimates the true contamination rate by ~2%.

To assess seasonal variation, we analyzed population genomic sequence data from 20 spring and 20 fall samples (‘Core20’). These samples represent a subset of the sequenced samples. We used samples that had both a spring and a fall collection taken from the same locality in the same year. We also used a maximum of 2 years of samples for a given locality to prevent the analysis from being dominated by characteristics of a single population. When there were more than 2 years of samples for a given population, we chose to use the 2 years with the earliest spring collection time. This decision was made on the assumption that the earlier spring collection would better represent allele frequencies following overwintering selection. This left 20 paired spring/fall samples, taken from 12 North American localities spread across 6 years and 3 European localities across 2 years (Supplementary file 1). The localities and years of sampling are as follows: Esparto, California 2012 and 2013; Tuolumne, California 2013; Lancaster, Massachusetts 2012 and 2014; Media, Pennsylvania 2010 and 2011; Ithaca, New York 2012; Cross Plains, Wisconsin 2012 and 2014; Athens, Georgia 2014; Charlottesville, Virginia 2014 and 2015; State College, Pennsylvania 2014; Topeka, Kansas 2014; Sudbury, Ontario, Canada 2015; Benton Harbor, Michigan 2014; Barcelona, Spain 2012; Gross-Enzersdorf, Vienna 2012; Odesa, Ukraine 2013. For comparison of seasonal with latitudinal variation, we used sequence data from four spring samples along the east coast of the United States (Homestead, Florida 2010; Hahira, Georgia 2008; Eutawville, South Carolina 2010; Linvilla, Pennsylvania 2009). We performed a principal component analysis of allele frequency per SNP per sample using the prcomp function in R.

Identification of seasonal sites

We identified seasonal sites using three separate methods: a general linear regression model (GLM), a Bayesian model (Bayenv; Coop et al., 2010; Günther and Coop, 2013), and a model based on the rank Fisher’s method (RFM). Statistical analyses were performed in R (R Development Core Team, 2014).

To perform the GLM we used the glm function with binomial error, weighted by the ‘effective coverage’ per SNP per population (Nc) – a measure of the number of chromosomes sampled, adjusted by the read depth:

Nc = (1/N + 1/R)^-1

where N is the number of chromosomes in the pool and R is the read depth at that site (Kolaczkowski et al., 2011; Feder et al., 2012; Bergland et al., 2014; Machado et al., 2016). This adjusts for the additional error introduced by sampling of the pool at the time of sequencing. The seasonal GLM is a regression of allele frequency by season (e.g., spring versus fall): yi = season + population + ei, where yi is the allele frequency at the ith SNP, population is the unordered categorical variable denoting each of the 20 populations, and ei is the binomial error at the ith SNP. Although the GLM is a powerful test, the corrected binomial sampling variance (Nc) is likely an anti-conservative estimate of the true sampling variance associated with pooled sequencing (Machado et al., 2016). Therefore, we use the results of this test as a seasonal outlier test, rather than an absolute measure of the deviation from the genome-wide null expectation of no allele frequency change between seasons. Results are compared to 500 permutations of the spring and fall labels within each population pair.
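The effective coverage formula translates directly into code. The following is an illustrative Python sketch (the paper's analyses were done in R); the function name and example values are ours:

```python
def effective_coverage(n_chromosomes, read_depth):
    """Nc = (1/N + 1/R)^-1: the effective number of chromosomes sampled,
    discounting for the second round of binomial sampling that occurs
    when the pooled DNA is sequenced."""
    return 1.0 / (1.0 / n_chromosomes + 1.0 / read_depth)

# Example: a pool of ~75 males contributes 150 autosomal chromosomes;
# at 94x depth the effective coverage is well below both numbers.
nc = effective_coverage(150, 94)
```

Note that Nc is always smaller than both N and R, so deeper sequencing cannot compensate for a small pool, and vice versa.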

We tested for signals of seasonality using Bayenv. First, we calculated the genetic covariance matrix among the Core20 samples. Next, we calculated Bayes factor scores for each SNP using the observed spring and fall labels. We used the read counts of the reference and alternate alleles, based on the corrected binomial sampling variance (Nc). We conducted 100 permutations of the spring and fall labels within each population pair.

Finally, we developed a test to identify signals of seasonality, which we call the ‘RFM’, that is robust to the inflated sample size estimates that can arise when using pooled allele frequencies (Machado et al., 2016). For this method, we first perform a Fisher’s exact test for spring–fall allele frequency differentiation within each population pair. We then rank-normalize the resulting p-values for any given population by grouping SNPs into categories based on the total number of reads and the number of alternate reads, summed between spring and fall. The class-based rank of each SNP becomes its new p-value, providing uniform sets of p-values across the genome. As the power (minimum p-value) is relative to the number of SNPs being ranked per bin, and as the bin sizes are equivalent across populations, each population has equivalent power. Additionally, we find that this method is robust to inaccurate estimates of allele frequency precision (Figure 2—figure supplement 3). We then combine ranked p-values using Fisher’s method for each SNP, taking two times the negative sum of logged p-values across the 20 spring/fall comparisons (each tail tested separately). The null distribution for this statistic (X2) is a chi-squared distribution with 40 degrees of freedom (two times the number of comparisons). We used the distribution of these per-SNP X2 test statistics to test for a seasonal signal above noise and compared it to 200 spring/fall permutations.
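The two core steps of the RFM — within-bin rank normalization and Fisher's method — can be sketched as follows. This is an illustrative Python version (the paper used R), and the binning key passed to `rank_normalize` is an assumption standing in for the read-count categories described above:

```python
import math
from collections import defaultdict

def rank_normalize(pvalues, coverage_bins):
    """Within each coverage bin, replace p-values by rank / (bin size + 1),
    yielding uniformly distributed p-values and equal power per population."""
    normalized = [0.0] * len(pvalues)
    groups = defaultdict(list)
    for i, b in enumerate(coverage_bins):
        groups[b].append(i)
    for indices in groups.values():
        indices.sort(key=lambda i: pvalues[i])
        for rank, i in enumerate(indices, start=1):
            normalized[i] = rank / (len(indices) + 1)
    return normalized

def fishers_method(per_population_p):
    """X^2 = -2 * sum(ln p); under the null this follows a chi-squared
    distribution with 2k degrees of freedom for k spring/fall pairs
    (df = 40 for the 20 Core20 comparisons)."""
    return -2.0 * sum(math.log(p) for p in per_population_p)
```

With 20 populations each giving p = 0.05, for instance, the combined statistic is ≈120 on 40 degrees of freedom, a strongly non-null signal.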

Inversions

We analyzed seasonal variation associated with six major cosmopolitan inversions found in D. melanogaster: In(2L)t (2L:2225744–13154180), In(2R)NS (2R:11278659–16163839), In(3L)P (3L:3173046–16301941), In(3R)K (3R:7576289–21966092), In(3R)Mo (3R:17232639–24857019), and In(3R)P (3R:12257931–20569732). As the three inversions on 3R are overlapping, we combined them. We classified the genome into three categories: inversion breakpoints (±500 kb around each inversion breakpoint; 104 k SNPs), inside inversions (excluding breakpoint regions; 390 k SNPs), and outside inversions (excluding breakpoint regions; 281 k SNPs).

Matched controls

With the assumption that the majority of the genome is not seasonal, we can use matched genomic controls as a null distribution to test for enrichment of different features of seasonal polymorphisms. The use of matched control SNP sets was employed for the tests of seasonal and latitudinal parallelism. We matched each SNP identified as significantly seasonal (at a range of p-values) to another SNP, matched for chromosome, effective coverage, median spring allele frequency (across populations), inversion status (either within or outside of the major inversions In(2L)t, In(2R)NS, In(3L)P, In(3R)K, In(3R)Mo, and In(3R)P), and recombination rate. We used the same D. melanogaster inversion breakpoints used in Corbett-Detig and Hartl, 2012 and the recombination rates from Comeron et al., 2012. We randomly sampled 100 of the possible matches per SNP (excluding the focal SNP) to produce 100 matched control sets. Any SNPs with fewer than 10 possible matches were discarded from the matched control analyses. We defined 95% confidence intervals from the matched controls as the 3rd and 98th ranked values for the quantity being tested (e.g., percent concordance or proportion of genic classes).
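The matching procedure can be sketched as grouping SNPs by their covariate profile and sampling controls from the focal SNP's group. This is a simplified Python illustration (the real matching also uses recombination rate and finer covariates; the field names here are hypothetical):

```python
import random
from collections import defaultdict

def build_match_groups(snps):
    """Index all SNPs by the covariates used for matching."""
    groups = defaultdict(list)
    for snp in snps:
        key = (snp["chrom"], snp["coverage_bin"], snp["freq_bin"], snp["inversion"])
        groups[key].append(snp["id"])
    return groups

def matched_control_sets(focal, groups, n_sets=100, min_matches=10, seed=0):
    """Sample n_sets control SNPs matched to the focal SNP, always excluding
    the focal SNP itself; SNPs with < min_matches candidates are discarded."""
    key = (focal["chrom"], focal["coverage_bin"], focal["freq_bin"], focal["inversion"])
    candidates = [s for s in groups[key] if s != focal["id"]]
    if len(candidates) < min_matches:
        return None  # too few matches: drop this SNP from the analysis
    rng = random.Random(seed)
    return [rng.choice(candidates) for _ in range(n_sets)]
```

Sampling with replacement across the 100 sets mirrors the fact that the same control SNP may serve in more than one matched set.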

Latitudinal cline concordance

To identify SNPs that changed consistently in allele frequency with latitude (clinal), we first identified SNPs that varied in allele frequency along a 14.4° latitudinal transect up the east coast of the United States. We used one spring sample from each of the following populations (identified as ‘Region_City_Year’): PA_li_2011 (39.9°N), SC_eu_2010 (33.4°N), GA_ha_2008 (30.1°N), and FL_ho_2010 (25.5°N).

First, we regressed allele frequency with population latitude in a general linear model (glm: R v3.1.0), using a binomial error model and weights proportional to the effective coverage (Nc): yi = latitude + ei, where yi is the allele frequency at the ith SNP and ei is the binomial error at the ith SNP. To confirm the robustness of the clinal GLM results, clinality was also assessed using a second, more complex model that accounts for existing covariance in the dataset (Bayenv: Coop et al., 2010; Günther and Coop, 2013).

We then tested the concordance in allele frequency change by season with the allele frequency change by latitude. We performed three separate seasonal regressions (see above) for comparison with the latitudinal regression: spring versus fall for the 18 non-Pennsylvania paired samples, spring versus fall for the three California paired samples, and spring versus fall for the three Europe paired samples. With the removal of the Pennsylvania samples, none of these three seasonal regressions contained samples from any of the four populations used for the latitudinal regression. Taking sets of increasingly clinal and increasingly seasonal SNPs, we assessed the proportion of sites that both increase in frequency from spring to fall and increase in frequency from north to south or that decrease in frequency from spring to fall and decrease in frequency from north to south. We compared this concordance with the concordance of 100 sets of matched controls.
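The concordance statistic reduces to a comparison of signs between the seasonal and latitudinal frequency changes. A minimal Python sketch, not the authors' implementation:

```python
def sign_concordance(seasonal_deltas, clinal_deltas):
    """Fraction of SNPs whose spring-to-fall change and latitudinal change
    share the same sign (SNPs with a zero change in either axis are excluded)."""
    pairs = [(s, c) for s, c in zip(seasonal_deltas, clinal_deltas)
             if s != 0 and c != 0]
    agree = sum(1 for s, c in pairs if (s > 0) == (c > 0))
    return agree / len(pairs)
```

Under the null of no seasonal/latitudinal parallelism, this fraction is expected to hover near 0.5, which is what the matched control sets estimate empirically.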

Predictability analysis

To test the general predictability of seasonal change in our dataset, we performed a leave-one-out analysis. In this analysis, we performed seasonal regressions for subsets of the data, dropping one paired sample and comparing it to a seasonal test of the remaining 19. For the single population model, we used a Fisher’s exact test. For the multi-population model, we used a GLM. For either model, we used the effective number of reads (Nc, defined above). We then measured the percent concordance of spring/fall allele frequency change, defined as the proportion of SNPs that agree in the direction of allele frequency change between spring and fall. This was performed 20 times, once for each paired sample.

To estimate the predictability scores, we calculated the rate of change of the concordance score as a function of quantile threshold. To estimate this rate, we used the GLM with binomial error, concordancei = quantilei + ei, where ‘concordance’ is the fraction of SNPs falling below the ith quantile threshold of both the target (19 population) and test (1 population) models that changed in frequency in a concordant fashion and e is the binomial error with weights corresponding to the total number of SNPs falling below that threshold. Thus, the genome-wide (or chromosome-specific) predictability score is heavily influenced by concordance rates of higher quantiles because that is where the bulk of SNPs reside.
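The predictability score is essentially a weighted slope of concordance against the quantile threshold. As a simplified illustration, here is a weighted least-squares version in Python; the paper's actual estimate comes from a binomial GLM, so this is an approximation, and the input values below are invented:

```python
def predictability_score(quantiles, concordances, snp_counts):
    """Weighted least-squares slope of concordance on quantile threshold,
    with weights equal to the number of SNPs falling below each threshold,
    so loose thresholds (most SNPs) dominate the fit."""
    total = sum(snp_counts)
    mean_q = sum(n * q for q, n in zip(quantiles, snp_counts)) / total
    mean_c = sum(n * c for c, n in zip(concordances, snp_counts)) / total
    num = sum(n * (q - mean_q) * (c - mean_c)
              for q, c, n in zip(quantiles, concordances, snp_counts))
    den = sum(n * (q - mean_q) ** 2 for q, n in zip(quantiles, snp_counts))
    return num / den
```

The sign of the slope is what separates concordant populations (whose top seasonal SNPs agree with the remaining 19) from discordant ones.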

We tested whether the variation in genome-wide predictability scores among populations is correlated with different features of the samples. We evaluated different nuisance parameters, such as collection method, substrate, and D. simulans contamination rate, as well as a number of features related to temperature. Daily minimum and maximum temperatures were obtained from the Global Historical Climatology Network-Daily (GHCND) database (Menne et al., 2012). We matched each locality to a weather station based on geographic distance. For the Core20 set of populations, four of the collections did not have precise collection dates for one or both seasonal collections (Athens, Georgia 2014; Media, Pennsylvania 2010, 2011; Barcelona, Spain 2012) or were not associated with any climate data from the GHCND database (Odesa, Ukraine 2013). These five populations were omitted from the analysis, leaving 15 populations for the remainder of the predictability analysis (Figure 3D). Using daily temperature data, we calculated the cumulative growing degree days following a model developed for D. suzukii by Wiman et al., 2014.
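Degree-day accumulation from daily minima and maxima can be sketched with a simple-average method. The base temperature below is an illustrative placeholder, not the D. suzukii parameterization of Wiman et al., 2014:

```python
def cumulative_degree_days(daily_tmin, daily_tmax, base_temp=10.0):
    """Sum of daily mean temperatures above a developmental base;
    days whose mean falls below the base contribute zero (they are
    clamped rather than subtracted)."""
    return sum(max(0.0, (lo + hi) / 2.0 - base_temp)
               for lo, hi in zip(daily_tmin, daily_tmax))
```

More refined degree-day models (e.g., single-sine interpolation between tmin and tmax) change the per-day contribution but not the basic accumulate-above-a-threshold logic.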

We regressed the predictability score onto each potential explanatory variable, recorded the proportion of variation explained (R2), and contrasted the observed R2 to null distributions generated via permutation (n = 25,000). For most variables, we performed a univariate model (e.g., latitude or longitude). In some cases, we used a bivariate model (e.g., latitude + longitude). To test the thermal limit model, we regressed the observed predictability score on the number of days above and below the thermal limits. We varied thermal limits from −5 to 40°C with a step size of 0.1°C.
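The thermal limit scan amounts to counting extreme days for each candidate pair of limits and recording the regression R2. A compact numpy sketch under an assumed data layout (per-population dicts of daily temperature series; the field names are ours):

```python
import numpy as np

def thermal_limit_r2(hot_spring_days, cold_fall_days, scores):
    """R^2 from regressing predictability scores on the two day counts
    (additive model with an intercept)."""
    y = np.asarray(scores, dtype=float)
    X = np.column_stack([np.ones(len(y)), hot_spring_days, cold_fall_days])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot

def best_thermal_limits(populations, scores, limits):
    """Scan candidate (s_max, f_min) pairs, keeping the pair with highest R^2.
    Each population dict holds its daily spring tmax and fall tmin series."""
    best = (-np.inf, None)
    for s_max in limits:
        for f_min in limits:
            hot = [sum(t > s_max for t in p["spring_tmax"]) for p in populations]
            cold = [sum(t < f_min for t in p["fall_tmin"]) for p in populations]
            best = max(best, (thermal_limit_r2(hot, cold, scores), (s_max, f_min)))
    return best
```

Because the same data are scanned over many candidate limits, the observed maximum R2 must be compared against permutations scanned the same way, as done in Figure 3C.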

Data availability

All raw sequence data have been deposited to the NCBI short read archive (SRA; BioProject Accession #PRJNA308584; accession numbers for each sample can be found in Supplementary file 1A). Code to conduct these analyses, primary results files, and code to reproduce the figures are available at https://github.com/machadoheather/dmel_seasonal_RTEC (copy archived at https://archive.softwareheritage.org/swh:1:rev:c4e0bc07a642c35470cd2c21b4f38f7ed0daa28d). VCF files with the raw allele frequencies per population and a R-data file of allele frequencies and effective sample sizes (Nc; compatible with scripts) are available on DataDryad (https://doi.org/10.5061/dryad.4r7b826).

The following data sets were generated:

    1. Bergland AO (2016) Real Time Evolution Consortium: Tracking the tempo and mode of evolution over ecologically relevant temporal and spatial scales. NCBI BioProject, ID PRJNA308584.
    2. Machado HE (2021) Data from: Broad geographic sampling reveals predictable, pervasive, and strong seasonal adaptation in Drosophila. Dryad Digital Repository. https://doi.org/10.5061/dryad.4r7b826

References

    1. Haldane JBS, Jayakar SD (1963) Polymorphism due to selection of varying direction. Journal of Genetics 58.
    2. Hendry AP (2020) Eco-Evolutionary Dynamics. Princeton, NJ: Princeton University Press.
    3. Levins RA (1974) Evolution in Changing Environments: Some Theoretical Explorations. Monographs in Population Biology. Princeton, NJ: Princeton University Press.
    4. R Development Core Team (2014) R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
    5. Stalker HD (1976) Chromosome studies in wild populations of D. melanogaster. Genetics 82:323–347.
    6. Tauber MJ, Tauber CA, Masaki S (1986) Seasonal Adaptations of Insects. New York: Oxford University Press.

Decision letter

  1. Magnus Nordborg
    Reviewing Editor; Austrian Academy of Sciences, Austria
  2. Patricia J Wittkopp
    Senior Editor; University of Michigan, United States
  3. Magnus Nordborg
    Reviewer; Austrian Academy of Sciences, Austria

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

This paper shows convincing evidence of correlated seasonal changes in allele frequency over large geographical scales.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Broad geographic sampling reveals predictable, pervasive, and strong seasonal adaptation in Drosophila" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Magnus Nordborg as the Reviewing Editor and Reviewer #3, and the evaluation has been overseen by a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Jeffrey Ross-Ibarra (Reviewer #2).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife at this stage.

This decision is not based on the quality of the study, or on the importance of the work, but rather on problems with the analysis, which we think will require months to rectify. Therefore, our decision should be seen as "reject, but encourage to resubmit".

As is evident from the individual reviews below, there is no doubt that this is an important and interesting study. However, while we are generally inclined to believe most of your results, the quality of the analysis leaves much to be desired. It needs to be simplified, and the many ad hoc assumptions justified or dropped.

There are also serious concerns about sampling (reviewer #1) and about whether the proposed model would actually work.

We hope these comments are helpful.

Reviewer #1:

In this manuscript the authors present an analysis of seasonal allele frequency change in multiple population samples of D. melanogaster that have been sampled in spring and fall from a number of collection sites within North America and Europe. Briefly, the authors search for seasonality of allele frequencies using simple linear models and then look for coordinated changes among populations. The authors report decent overlap among "seasonal" SNPs, overlap with sites found to be clinal, as well as a number of ancillary results.

In general I find the dataset quite interesting and the study timely. That said, there are a number of technical issues throughout the paper, and it suffers in numerous places from what I think are inappropriate statistical assumptions. Even so, I'm sympathetic to what the authors are attempting to do: it is hard to do correctly.

1) I have a number of concerns about the sampling design of this study:

a. First and foremost there is no consistency to the sampling date between years or localities. For instance the Fall samples from Linvilla, PA were collected either in October or November. Indeed a quick search for average temps reveals that at their collection site the average daily temperature differs by ~ 10 degrees F between those two dates. Thus I'm concerned that "season" is not a properly replicated factor across the design and instead there could be significant batch effects.

b. I'm concerned about the possibility of contamination from accidental collection of D. simulans, which may vary systematically between the seasons (we know that the abundance of sim vs mel in local collections changes dramatically throughout the season). While the authors are rightly trying to account for this by competitive mapping, more assurances should be given that this is working. I would like to see a simple simulation demonstrating the power of the competitive mapping procedure: a straightforward way to do this would be to make synthetic pools of reads from melanogaster and simulans, while varying the percentage of simulans in the pool, and then take those simulated pools through the authors' mapping procedure. In addition, the authors should provide the percentage of reads that mapped to the sim assembly for each sample in Suppl Table 1. Once they have those they should test for an effect of season on that percentage.

c. I notice from the methods that collections varied with respect to baits used (banana or yeast), and aspiration versus netting. The authors should provide the associated details in the Suppl Table, as well as check for batch effects again. It would be a shame if collection strategy were a hidden confounder.

d. Finally, a question: did the authors examine the genital arch of collected males to try to exclude simulans from the pools that way?

2) The heavy filtering of SNPs here will lead to some interesting ascertainment biases. Not only are they filtering out rare and private SNPs but the authors are requiring that the SNPs are found in EVERY population. We should expect this to be quite a biased set of SNPs, likely in the direction of enriching for balanced polymorphisms. While this is what the authors want to find I guess, it means that any statements regarding the percentage of SNPs that are under seasonally varying selection (and thus temporally balanced) will be significant overestimates. The authors at the very least will need to add significant caveats to their interpretations, but I would encourage them to consider redoing the analysis on an unfiltered set of SNPs as well, even if that means that the authors have to restrict portions of the analysis to within populations.

3) Line 742-I'm concerned that the authors are using too simple a regression here. They are assuming that allele frequency in fall is independent of allele frequency in spring within a population, which of course it is not. At the very least I think the authors should be doing a repeated-measures regression, but I wonder if there are even better ways to do this statistical testing, perhaps using a method of regression appropriate for autocorrelated time-series observations.

4) Line 821-this model is not appropriate, as observations of allele frequencies among populations are correlated via shared population history. The authors need to account for that covariance structure in this regression. See, for instance, Coop et al. 2010.

5) Line 841-among-population comparisons of the "rank p-value Fisher's method" test (can the authors come up with a better name?) are concerning, as the authors are using the number of reads as the data. If there are differences in read depth between samples, and there are, won't this test have different power among different populations?

6) Lines 309-314. This analysis is unconvincing and the result quite weak. I note that the authors in this one tiny section are now presenting Pearson's R instead of R2. The coefficient of determination is very low for each sample here. Thus I feel that the authors are overinterpreting what they have done.

7) The "flipped model" strikes me as troubling philosophically. The authors report a negative relation, so to make it line up with observations from other populations they are flipping the season labels? The authors need to do quite a bit of work to justify this as anything other than p-hacking, and in my opinion the authors should remove analysis of the "flipped" set from the paper.

Reviewer #2:

Overall this is a very nice study that convincingly shows an impact of seasonality on allele frequencies across many Drosophila populations, suggesting temporally-variable balancing selection may impact a large proportion of the Drosophila genome.

Are the estimates of the number of SNPs and strength of selection consistent with the observed patterns and decay of Fst? That is, how well does Fst and the decay of Fst in your simulations match the observed data?

Given these same parameters it might be fun (but admittedly speculative) to estimate what proportion of sites in the genome are affected by "seasonal draft".

What does the estimate of the strength of selection and number of sites affected tell us about load? Are local Drosophila population sizes sufficient that this is plausible without leading to extinction?

Of course the flipped model does provide more convincing evidence of some sort of seasonal effect, but I think I need a bit more convincing that the flipped model is justified other than the fact that it makes everything fit the preconceived model better.

How important is identification of individual causal loci (Line 653)? If the trait really is highly polygenic and there is limited concordance in loci among populations, how likely is this to be successful? This doesn't seem to me the natural conclusion from the work presented. It seems to me that careful quant gen studies of phenotypes, selection gradients, and how Vg or Va change among populations and over time might be of more interest.

I'm generally not a fan of GO analyses but recognize that many readers will ask for such tests; I thought the paragraph presenting these results was appropriately cautious.

The definitions of spring and fall confused me somewhat. Are they based on per-location information on sampling abundance of Drosophila? It seems like there must be good weather station data for most of these locations; could a definition based on known thermal tolerances of Drosophila be used?

What proportion of the ~1M SNPs filtered because they were not found across all populations show evidence of seasonal allele frequency change? It seems like population-specific variation is an interesting area that was left unexplored? Does this also explain some of the difference between Bergland 2014 and the present study?

Reviewer #3:

This manuscript addresses a classical question with a unique data set. I was looking forward to reading it, but was disappointed by the statistical analysis, which I found both baffling and inappropriate.

I think the core of the problem is the lack of clearly and explicitly specified models and questions. Instead, you use a series of ad hoc tests that quickly made me lose faith in your conclusions.

It starts at the beginning (l. 186), where you perform a "test" of allele frequency change without specifying what you are testing, and why. After flipping between the 3-4 copies of the file I needed to have open to simultaneously look at text, figures, supplementary figures, and methods (which are not well written, see, e.g., the meaningless equation on l. 742), I think you are simply testing whether the allele frequencies at each SNP in each population differ significantly between spring and autumn under a very naive model, which you reject genome-wide (Figure 2A). But rather than tossing out this model and going back to the drawing board, you base all the remaining analyses on complicated intersections of p-values (which are inherently irrelevant in this paper, since you are not trying to identify individual loci).

This seems very odd. The main question of the paper is whether allele frequencies change consistently across populations, and the natural way to test this is using some kind of permutation scheme using the allele frequency differences. You have naturally paired data, which probably can be treated as independent replicates (although this requires an argument about migration).

Having established this, why not fit the whole data set into a generalized linear mixed model, without a priori assumptions about the distribution of the random changes in allele frequencies (which must reflect both estimation error and random drift), and with explicit terms for selection (as a function of local climate) and (possibly) migration?

This would be so much easier to interpret than the ad hoc analyses you carry out, and would also avoid possible biases due to arbitrary p-value cut-offs for SNPs.

Using such an approach, it should be possible to estimate what fraction of the genome changes in a non-random fashion. I am less sure about the strength of selection, as I think this only has meaning within an accurate demographic model (which you do not have).

To sum up, I don't think you are doing your data justice here.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for submitting your work entitled "Broad geographic sampling reveals predictable, pervasive, and strong seasonal adaptation in Drosophila" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, including Magnus Nordborg as the Reviewing Editor and Reviewer #1, and the evaluation has been overseen by a Senior Editor.

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that our decision remains the same as after the first round: you are welcome to resubmit, but dramatic changes are (still) necessary. We are sorry if we didn't manage to explain this the first time around.

Although we remain convinced that your study may contain an important and interesting result – strong selection related to season/climate on a wide geographic scale – your analyses largely obscure this. A much simpler (and shorter!) paper that focused on convincingly demonstrating this fact would be much better.

Other major problems include the "flipped" analysis, which is not only post-hoc, but serves to illustrate that you should use climate data directly since the seasons obviously aren't really seasons, and the selection simulations which are based on models that are too simplistic.

Finally, arguing that the "haphazard" sampling is a strength, is, well, not convincing.

Our recommendation remains the same: given that the sampling is haphazard, simplify both claims and analysis in order to make the former rock solid. As you know, a recent pre-print by Buffalo and Coop argues that there is no signal in your first paper: here is your chance to prove that you were right after all.

Reviewer #1:

This version is much better than the previous one, but I still find the analysis confused and confusing. I will comment further below, but before getting to this, I should state clearly that I know how difficult this is. We are currently struggling with analogous data from Arabidopsis, and it is clear that we lack an established framework for thinking about them. This is not textbook stuff.

A related, high-level comment is how ridiculously primitive we are from the point of view of basic ecology. We have no idea how selection is working, either in time or space, much less what the main selective pressures are. You write in your introduction that you "elected to solve these two related problems by (i) working as a consortium of Drosophila biologists (DrosRTEC or the Drosophila Real Time Evolution Consortium) and (ii) sampling in a somewhat haphazard manner." The first point is good, but the second is nonsense. The solution is surely not the kind of haphazard sampling described in this paper, but rather dense and non-random sampling in both time and space. (This is needed for Arabidopsis as well, btw)

When I first read this, I didn't even realize that all your samples were not collected in a single year. This raises the issue of why you don't take this into account. In our multi-year Arabidopsis field experiments, we find that there is often greater variability between years than between sites located nearly 1,000 km from each other in a north-to-south direction (moving from deciduous forest to taiga).

First, what is the temporal correlation? The effect of space is clear from your PCA, but does time explain anything? How far do these flies move?

Second, you don't use climate data except to carry out your post hoc "flipping" of fall/spring labels. Did you try using climate data to explain what you see, instead of a priori notions of spring and fall that you yourself dismiss as inadequate? We found that PCAs of multi-dimensional climate data were more strongly correlated with fitness than geographic proxies.

Another major unknown is the demographics of fruit flies. Your analyses assume (explicitly when you simulate selection) that each sampling location is a population (that dreaded population genetics abstraction that we all know doesn't really exist). You convincingly demonstrate that what you see cannot be due to migration of randomly differentiated flies (because there is some kind of concordance in changes globally), but this does not preclude most of the allele frequency change being due to migration of selectively differentiated flies. Perhaps what happens is that there is selective die-off each winter, leading to cold-hardy survivors being overrepresented each spring – until they are swamped by southern migrants with vastly greater population sizes? Still selection, still adaptation, but with rather different predictions for the dynamics of allele frequencies. Talk about how fast populations adapt is less meaningful when we don't know what a population is. Returning to my initial point, to understand adaptation, we must understand the spatial and temporal scales over which allele frequencies change. You don't have the sampling density to do this.

Glass half full: if you really see consistent changes despite all this, there must be some BIG signals out there…

I am convinced that you do see such changes, but my main comment remains that I think you could learn more from these data if you took several steps back and thought carefully about alternative models, and the simplest possible ways to test them. I suggested using the paired structure (you still need to show that pairs can be treated as independent, btw), but I think you can take it further than you have in estimating effects at individual SNPs. But it is not my job as a reviewer to tell you how I would approach this, but rather to check whether I think your claims hold.

Point-by-point, then: I am convinced by your permutations that there are large coordinated changes, but you still need to discuss the possible role of spatial and temporal correlations between pairs. I also think this point could be made better by going directly to estimates of allele frequency change, or beta from the regression, rather than involving p-values.

I find the whole "predictability" analysis convoluted and unconvincing. An attempt to shoehorn observations into an a priori framework. The leave-one-out analysis also requires some discussion of whether these points are independent, and I think it would not be necessary if you directly estimated consistent effects using permutation.

The comparison with clinal data is interesting, but again suffers from too strong priors of what is going on. You have climate data – why not simply check whether there are variables that explain both spatial and temporal shifts?

I find your analysis of the strength of selection across the genome unconvincing for the reasons outlined above: you have absolutely no idea of the demographic model, and treating each population as a closed system (i.e. generating fall by sampling from spring) is simply not warranted.

The analysis of which SNPs are under selection is not convincing. The complete lack of correspondence between the original and the "flipped" model strongly suggests that the peaks are not to be trusted. Here is where jack-knifing and bootstrapping might be useful: to assess the robustness of your p-value estimates. My guess is that while your data is strong enough to show that there is selection in aggregate, getting down to individual loci is not possible. This also militates against relying on high-significance filters rather than the whole distribution of effect sizes.

Reviewer #2:

I have read the revised submission and the responses to review carefully. I still have major issues with the analysis done in this manuscript.

1) The authors still provide textbook case of p-hacking through their "flipped" analysis. Their justification-that the leave-one-out analysis tipped them off to a flipped effect-is not a justification at all. This entire section of the paper needs to be removed.

2) The binomial GLM that the authors are doing is inappropriate, as I pointed out in the initial submission. The "paired fashion" of sampling helps nothing: the authors are simply testing against the null of there being no difference in frequency between seasons assuming a binomial error model. The binomial error model here is inappropriate because spring and fall are not independent draws from the same parameterized distribution; allele frequency change is expected to occur due to drift. The authors are not accounting for this, and it is a major flaw. I would suggest at the very least that the authors use a Nicholson et al.-type framework to properly account for allele frequency change between seasons.

3) The justification that the authors give about "haphazard sampling" is risible. Adding noise does not add "inferential power", as the authors claim on line 174.

4) The issue of biased ascertainment of SNPs has not been dealt with. The authors simply give their same estimates of genome-wide numbers of SNPs affected by seasonal selection and then follow it with a caveat. All such genome-wide estimates should be removed-you can't estimate them given your ascertainment conditions. Moreover you say on line 235 "Whether this SNP selection process generates bias in our estimates […] remains to be determined." This is unacceptable: your ascertainment definitely creates bias, and this language brushes that under the rug.

5) The code for the permutation routines and the control SNP matching needs to be shared.

6) Lines 293-295. The authors are reporting that permutation suggests the p-value of enrichment > 0.1. This suggests chance and nothing else is responsible for the observed effect, despite the authors' conclusion of "robust evidence that parallel seasonal adaptation is a general feature…"

7) Line 309-no enrichment test has been described.

8) Line 330-Citing Gould here is silly. Pick a more appropriate citation.

9) Supplemental Figure 7-in the Bayesenv analysis that I asked for the "All" curve looks very different and goes against the conclusion that the authors are making. No explanation is given.

10) The ABC analysis, now described, is not using proper population genetic simulations of allele frequency change per generation due to drift + selection. As written, there is only one generation of drift. This needs to be changed to take this analysis even moderately seriously. Moreover, the code for these simulations needs to be shared.

11) Line 1020-In the concordance regression the authors are doing a binned regression-this is never appropriate, and the authors need to redo this analysis without binning.

Reviewer #3:

Overall this is an improved manuscript. Easier to read and follow, and better explained. There are several points I think that should still be addressed.

I am still not a fan of the flipped model. I agree that some of the evidence (predicting into the validation set, etc.) does indeed argue it's a better fit, but it still feels like ad-hoc subjective tweaking of the data until it fits well. I would prefer it to be removed from the paper -- I would instead show the original model and point out that some populations show the reverse pattern, and that this matches with temperature. Perhaps even include the flipped model in the supplement. I would find that more convincing than the flipped model, I think. In either case, the paragraph starting on line 445 should go, as even the authors admit this doesn't really show anything meaningful. The flipped model should also be removed from figures 2A and 2B, as again it will by definition show a more convincing signal here.

As an alternative to the flipped model or presenting the data with the coarse labels of spring and fall, why not actually model the temperature data available? It would seem an objective a priori approach that should allow for differences in the flipped populations (i.e. presumably the difference in temperature in the 3 weeks prior between Fall and Spring behaves differently for the flipped populations). Perhaps use mean temperature in the 3 weeks prior and/or the slope of the change of temperature over that time? I'm sure there are more creative/intelligent options, but I don't quite understand why the authors can't use this data instead of grossly categorizing things as spring or fall. I didn't see a good reason for not doing so in the response.

I find the authors' treatment of enrichment odd. In some places it is presented as convincing evidence, and in others (line 563) it is disregarded because of absolute numbers. The logic on line 563 is fine of course, but I would like to see enrichment treated the same way throughout. On line 477 it is convincing as a log-odds score, and in the paragraph starting on line 318 a modest percentage enrichment is considered good evidence.

I'd like to see a bit more exploration of the clustering. Figure 5D (visually) and the 100kb window analysis seem to suggest that clustering is on a relatively large scale, yet the analysis presented on line 629 for % genome and s only investigates 5kb windows. If I'm understanding the ABC correctly, it should be pretty fast to run, and it seems like running it at the 50, 100, or even 500kb scale might be of interest. (To my eye, some of the figures in S10 start to suggest a flattening of the ridge when done at the 5kb scale.) Certainly the data do appear to argue for a polygenic architecture, but whether this is ~50 windows or 5% of the genome I think isn't well differentiated.

Line 800: I agree with this logic about temperature and why some populations behave differently. I would have liked to see this prediction about temperature earlier in the introduction. Naively my first impression was that Fall populations would be adapted to cooler conditions and Spring to warmer. I see now why that is wrong, but I think stating up front that Fall populations are expected to reflect adaptation to warm summers would help some readers.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Broad geographic sampling reveals predictable, pervasive, and strong seasonal adaptation in Drosophila" for further consideration by eLife. Your revised article has been evaluated by Patricia Wittkopp as the Senior Editor and Magnus Nordborg as the Reviewing Editor.

Almost there! As you will see from the comments below, we got a fresh 3rd reviewer, who picked up on something we agree should be addressed, namely the likely role of inversions. We have two suggestions for how to proceed. Either: a) go through the manuscript and make sure you emphasize that much of the signal may be driven by inversions, and that it is impossible to know how polygenic this really is, or; b) provide additional analyses (e.g., of the X chromosome), to demonstrate that there is a signal independent of inversions.

Reviewer #1:

The authors were trying to confirm preliminary results suggesting that environmental changes accompanying seasonal change drive genome-wide allele frequency changes in Drosophila. This would give new insight into how selection works, and what factors might maintain genetic variation – at least in short-lived organisms. Although the detailed mechanisms are obscure, the authors use parallel changes over large geographic distances to argue convincingly that some form of seasonal selection must be taking place.

This is the third time I see this manuscript, so I will say no more than that it is greatly improved. I'm happy with it.

Reviewer #2:

This is a much improved, clearer version of the manuscript. The analyses are simpler and better explained, and the results I think come out clearer as a result.

Reviewer #3:

The strongest seasonal signal comes from inversions. If inversions are responding to seasonal selection, it is not surprising that the authors find parallel SNP changes across populations, as the same inversions are shared globally. Unless the authors refocus the manuscript on parallel selection on inversions, their analyses need to be modified: almost all analyses use the full SNP set, but to study real parallel selection responses at the SNP level, the authors need to restrict their analysis to SNPs that are not affected by inversions. To this end, it is important to keep in mind that inversions may suppress recombination also outside of the inversion, which makes it a bit challenging to determine the autosomal fraction that is not affected by inversions. A much better strategy would be to analyze the X chromosome, which is the only major chromosome free of inversions. Unfortunately, the authors excluded this chromosome from their analyses.

Anyway, inspection of Figure 2D shows that the signal for seasonal SNPs is erased for regions outside of the inversions. Furthermore, a significant concordance pattern between seasonal and clinal SNPs outside of the inversion is restricted to 2L and 3R, the chromosomes with the strongest inversion effects. This could be interpreted as an effect of inversions on the genomic regions flanking the inversion.

How do the authors interpret a (presumably significant) underrepresentation of concordance SNPs on 3L?

Apart from my doubts about the significance of the seasonal selection signal, I would like to come back to the novel aspect of the manuscript: the sharing of seasonal SNPs across populations. The authors highlight, probably correctly, that seasonal adaptation is polygenic. This raises the question of whether parallel selection signatures are expected in differentiated populations. In my opinion two lines of reasoning speak against it: 1) probably more variants are segregating in the populations than are required for seasonal adaptation (redundancy); 2) the frequencies of the seasonal SNPs most likely differ between the populations. Hence, SNPs closer to 50% are expected to respond more to the same selection pressure than SNPs with more extreme allele frequencies. This will lead to different power to detect the same selection response in differentiated populations.

Analyze the X-chromosome.

Remove the second season from the locations where two spring-fall pairs were included; only this makes the comparison unbiased.

Evaluate whether the spring-fall permutations remove the statistical issues of the GLM and Fisher's exact tests mentioned by the previous reviewers. Clarify that the matched controls were done on a sample basis, rather than across samples.

Clarify that the effective coverage was calculated per SNP.

The authors cite theoretical work which suggests that seasonal SNPs may be maintained only under highly restricted conditions (changing dominance); do they find empirical support that these conditions are met in their data?

The significance of the manuscript to a broader audience could be increased by:

– A statement that the seasonal selection response is restricted to inversions, but I doubt that this is the message the authors would like to portray.

– A general discussion about the expectations of parallel selection signatures on the SNP level across populations and why the authors expect to see it (or find it against the expectations).

https://doi.org/10.7554/eLife.67577.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Reviewer #1:

[…] 1) I have a number of concerns about the sampling design of this study:

a. First and foremost, there is no consistency in the sampling date between years or localities. For instance, the Fall samples from Linvilla, PA were collected either in October or November. Indeed, a quick search for average temperatures reveals that at their collection site the average daily temperature differs by ~10 degrees F between those two dates. Thus I'm concerned that "season" is not a properly replicated factor across the design and that instead there could be significant batch effects.

The reviewer brings up an important point about the nature of seasons. A discussion of what defines a “season”, as it relates to the evolutionary dynamics of flies, is a major thread throughout the manuscript. In this revision, we have strengthened our treatment of this central topic throughout the Introduction, Results, and Discussion. Here, we list the text that we have added to the manuscript:

1. In the introduction (P4, L139-151), we highlight two challenges in studying seasonal evolution in multiple fly populations:

“Examining patterns of allele frequency change on seasonal timescales across a species’ range faces two problems. […] While it is possible, albeit logistically difficult, to sample flies according to some matching of seasons using the calendar window (e.g. first week of June for the “spring” and the first week of November for the “fall”) or using a pre-determined physiological time (e.g., “spring“ is the time after 21 cumulative growing degree days), it is not at all clear that any such matching would be meaningful in terms of the ecology, physiology, and evolution of local populations.”

2. With this statement of the challenges of defining seasons and sampling across them in multiple populations articulated, we go on to discuss the role of consortia based sampling (P4, L153-162) and then describe in detail how haphazard sampling is both logistically necessary and conceptually important (P4, L164-167):

“We chose to carry out haphazard sampling for logistical and inferential reasons. In this study, the “spring” sample was generally taken close to the time when the populations of D. melanogaster become locally abundant. […] Thus, a haphazard approach might in fact have the greatest inferential power provided that the signal of seasonal adaptation can be detected in such a complex dataset.”

We recognize that the reviewer is bringing up an important point about proper replication; however, we also hope that the reviewer understands that we are sampling flies from natural habitats and that short-term changes in environmental factors will always exist. To reiterate the argument that we advance in the manuscript, the semi-predictable changes in selection pressure observed in this system provide an ideal opportunity to decouple, and therefore identify, the environmental factors acting as strong selective agents in the wild.

b. I'm concerned about the possibility of contamination from accidental collection of D. simulans that may vary systematically between the seasons (we know that the abundance of sim vs mel in local collections changes dramatically throughout the season). While the authors are rightly trying to account for this by competitive mapping, more assurances should be given that this is working. I would like to see a simple simulation demonstrating the power of the competitive mapping procedure: a straightforward way to do this would be to make synthetic pools of reads from melanogaster and simulans, varying the percentage of simulans in the pool, and then take those simulated pools through the authors' mapping procedure. In addition, the authors should provide the percentage of reads that mapped to the sim assembly for each sample in Suppl Table 1. Once they have those they should test for an effect of season on that percentage.

The reviewer correctly points out that accidental collection of D. simulans can occur, and if the probability of such contamination varies between seasons this may influence our inferences of seasonal adaptation. We have taken the following steps to address the reviewer’s concerns:

1. We conducted simulations showing that we can accurately estimate the fraction of D. simulans reads through competitive mapping. This work is now described in the Materials and methods (page 23, L912-926) and Supplemental Figure 12:

“In silico simulation analysis shows that competitive mapping accurately estimates the fraction of D. simulans contamination. […] The level of residual D. simulans contamination does not correlate with contamination rate and is roughly 9% (Supplemental Figure 13B).”

2. We show that the local density of seasonal polymorphisms is not correlated with the rate of D. simulans reads mapping to the D. melanogaster genome even with competitive mapping (page 23, L926-928):

“Additionally, the proportion of seasonal sites (top 1%) is not correlated with the rate of residual mapping of D. simulans reads to the D. melanogaster genome as inferred through our simulation (r = -0.0026, p=0.93)”.

3. We show that there is no bias in the percent of D. simulans contamination between our spring and fall samples (t-test p=0.90; added to P23, L907-910):

“For the populations used in this analysis, the average proportion of reads mapping preferentially to D. simulans was less than 1% (see Supplemental Table 1), and there was no significant difference in the level of D. simulans contamination between spring and fall samples (t-test p=0.90).”

4. We separated D. melanogaster from D. simulans males by examination of the genital arch, and only sequenced those individuals likely to be D. melanogaster. Residual D. simulans contamination is likely due to human error in this process. We state clearly that we use the genital arch on P22 L876:

“Flies were confirmed to be D. melanogaster by examination of the male genital arch.”

5. As in the previously submitted manuscript, we excluded any samples with high D. simulans contamination. We state this in the Materials and methods. Intriguingly, both of the excluded samples were from the fall, when D. simulans was generally thought to be in low abundance (Schmidt 2011). We interpret the high D. simulans contamination rate of these fall samples to be a consequence of poorly tuned priors on the part of the person doing the sorting. We have decided to leave this interpretation out of the manuscript, however.
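The seasonal-bias check described in point 3 above amounts to a two-sample t-test on per-sample contamination fractions. As a minimal sketch (the contamination values below are invented for illustration, not the actual values from Supplemental Table 1):

```python
# Hypothetical sketch of testing for a seasonal bias in D. simulans
# contamination. The per-sample fractions are invented for illustration.
from scipy import stats

# fraction of reads mapping preferentially to D. simulans, per sample
spring_contam = [0.004, 0.007, 0.002, 0.009, 0.005]
fall_contam = [0.006, 0.003, 0.008, 0.004, 0.006]

# two-sample t-test: is contamination systematically different by season?
t, p = stats.ttest_ind(spring_contam, fall_contam)
print(f"t = {t:.2f}, p = {p:.2f}")
```

A large p-value here indicates no detectable seasonal bias in contamination, which is the pattern reported above (t-test p=0.90).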

c. I notice from the methods that collections varied with respect to baits used (banana or yeast), versus aspiration versus netting. The authors should provide the associated details on the Suppl Table as well check for batch effects again. It would be a shame if collection strategy were a hidden confounder.

We have added collection substrate, method, and location type (residential, orchard, etc.) to Supplementary Table 1.

We also tested whether collection locale (orchard vs. residential), collection method (aspirator, net, trap), or collection substrate (apple, banana, compost/pomace pile, windfall fruit, other [cherries, peaches, mixed fruit]) in the spring or fall collection is correlated with the genome-wide predictability score. Using a simple ANOVA, the smallest p-value across all tests is 0.45. We have added this analysis to the main text, where we test whether predictability scores are correlated with other environmental values (P10-11, L401-407):

“However, genome-wide predictability scores are not correlated with spring or fall collection date (p = 0.6 and 0.9, respectively), cumulative growing degrees (p = 0.5, 0.86), collection locale (residential vs. orchard, p = 0.6, 0.4), collection method (aspirator vs. net vs. trap, p = 0.9, 0.9), collection substrate (contrasting various types of fruit, compost, p = 0.8, 0.9), percent D. simulans contamination (0.8, 0.8), latitude (p = 0.9), or the difference in cumulative growing degree days between spring and fall (p = 0.46).”

We note that there may be biologically meaningful interactions between the aspects of the environment mentioned above, which could contribute to the observed signals of seasonal adaptation. We completely agree with the reviewer that this is an important point, and we hope that once the seasonal dataset grows to hundreds or thousands of samples, such analyses will become powerful enough to identify the effects of the myriad selective forces underlying rapid adaptation in the wild.
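The batch-effect test above reduces to a one-way ANOVA of per-population predictability scores grouped by a collection factor. A minimal sketch, with invented scores and groupings:

```python
# Hypothetical sketch of the batch-effect check: a one-way ANOVA of
# genome-wide predictability scores against a collection factor.
# Scores and group memberships are invented for illustration.
from scipy import stats

# predictability score per population, grouped by collection method
aspirator = [0.12, 0.08, 0.15, 0.10]
net = [0.11, 0.14, 0.09]
trap = [0.13, 0.10, 0.12]

F, p = stats.f_oneway(aspirator, net, trap)
print(f"F = {F:.2f}, p = {p:.2f}")
```

A large p-value, as in all the tests reported above (smallest p = 0.45), gives no evidence that collection strategy acts as a hidden confounder.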

d. Finally, a question: did the authors examine the genital arch of collected males to try to exclude simulans from the pools that way?

Yes, we did. Please see above.

2) The heavy filtering of SNPs here will lead to some interesting ascertainment biases. Not only are they filtering out rare and private SNPs but the authors are requiring that the SNPs are found in EVERY population. We should expect this to be quite a biased set of SNPs, likely in the direction of enriching for balanced polymorphisms. While this is what the authors want to find I guess, it means that any statements regarding the percentage of SNPs that are under seasonally varying selection (and thus temporally balanced) will be significant overestimates. The authors at the very least will need to add significant caveats to their interpretations, but I would encourage them to consider redoing the analysis on an unfiltered set of SNPs as well, even if that means that the authors have to restrict portions of the analysis to within populations.

We agree with the reviewer and thank them for bringing up this interesting and important point. We now discuss the limitations of our analysis, specifically with respect to the ascertainment bias caused by the minor allele frequency cut-off and the requirement for polymorphism in all populations. We have now addressed this point at several places in the manuscript and have performed additional analyses. We outline these changes here:

1. We acknowledge the general problem in the beginning of the results where we write on P6 L234-237:

“Whether this SNP selection process generates bias in our estimates of the strength and magnitude of seasonally variable selection remains to be determined but note that the smaller set still represents approximately half of all detected SNPs.”

2. Again, on P15 L610-615, where we write:

“We note that our analysis is potentially biased because we are conditioning on polymorphisms that are present in all populations tested, including North American and European ones with different long-term dynamics and histories of colonization, expansion, and admixture (Keller 2007; Bergland et al. 2016; Kapopoulou et al. 2018). The filtering requirement that we impose may therefore bias our SNP set to those which are subject to balancing selection, causing an upward bias in our estimates.”

3. We discuss how our estimates of the strength and extent of seasonal allele frequency changes are qualitatively similar when we include the broader set of 1.75M SNPs (P16 L615):

“We note, however, there are no qualitative differences found as a function of filtering for common SNPs (compare Figure 4B to Supplemental Figure 9).”

The reviewer also raises the interesting suggestion of performing analyses of seasonal adaptation within populations. On P20, L765-771, we discuss how the dataset that we have assembled only has high power for analysis across populations:

“Note that we generally lack power to detect alleles that cycle in a small subset of populations or are private to a single population. […] Thus, our conclusions are limited to the shared genetic basis of seasonal adaptation as revealed by the cycling of common SNPs.”

3) Line 742 – I'm concerned that the authors are using too simple a regression here. They are assuming that allele frequency in fall is independent of allele frequency in spring within a population, which of course it is not. At the very least I think the authors should be doing a repeated measures regression, but I wonder if there are even better ways to do this statistical testing, perhaps using a method of regression appropriate for autocorrelated time-series observations.

We thank the reviewer for their helpful suggestion. However, we believe that our model is appropriate because we are treating the data in a paired fashion.

4) Line 821 – this model is not appropriate as observations of allele frequencies among populations are correlated via shared population history. The authors need to account for that covariance structure in this regression. See for instance Coop et al. 2010

We have taken the reviewer’s recommendation and performed an additional analysis using the recommended Bayenv regression. This new analysis does not substantially change the results or any of the conclusions. We have added this comparison to the manuscript (P13, L521-525; Supplementary Figure 7):

“Seasonally varying SNPs were identified among these 18 populations using both the original and flipped seasonal labels. […] We find that the rate of parallelism increases with increasingly stringent seasonal and clinal significance thresholds (Figure 3A) using either model of clinality (Supplemental Figure 7).”

5) Line 841 – among-population comparisons of the "rank p-value Fisher's method" test (can the authors come up with a better name?) are concerning, as the authors are using the number of reads as the data. If there are differences in read depth between samples, and there are, won't this test have different power among different populations?

The ranking of the Fisher’s exact test statistic (which produces the rank p-values that are the input to the Fisher’s method test) is performed within each population. The power depends on the number of observations (SNPs) being ranked. Within each population, ranking occurs among SNPs with similar allele frequencies and depths (see Materials and methods for binning). The number of bins is the same for each population, so each population has equivalent power. We have clarified this in the Materials and methods, P27 L1096-1098:

“However, as our rank p-value Fisher’s method relies only on the rank within a paired sample, and the consistency of the rank across paired samples, there is not likely to be an artificially inflated seasonal signal due to incorrectly estimated sampling error”.

We use the acronym “RFM” throughout the manuscript after defining it once.
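A minimal sketch of the ranking-and-combining logic behind the RFM test (omitting the allele frequency and depth binning described in the Materials and methods); the per-population test statistics here are simulated under the null, so the combined p-values should themselves be roughly uniform:

```python
# Hypothetical sketch of the rank p-value Fisher's method (RFM):
# rank a per-SNP statistic within each population, convert ranks to
# p-values, and combine them across populations with Fisher's method.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_snps, n_pops = 1000, 5

# per-SNP, per-population test statistics, simulated under the null
stat = rng.random((n_snps, n_pops))

# rank within each population -> rank p-values in (0, 1]
rank_p = stats.rankdata(stat, axis=0) / n_snps

# Fisher's method: combine the k rank p-values per SNP
chi2 = -2 * np.log(rank_p).sum(axis=1)
combined_p = stats.chi2.sf(chi2, df=2 * n_pops)
print(combined_p.mean())
```

Because each population contributes only a within-population rank, differences in read depth between populations do not translate into differences in power, which is the point made above.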

6) Lines 309-314. This analysis is unconvincing and the result quite weak. I note that the authors in this one tiny section are now presenting Pearson's R instead of R2. The coefficient of determination is very low for each sample here. Thus I feel that the authors are over-interpreting what they have done.

We appreciate the reviewer’s prudent interpretation of this result. It reflects our own cautious interpretation, which we included in the original manuscript and which remains in the current one on P11, L424 (“While this correlation of three points is not significantly different from zero”).

We also note that the use of R instead of R2 is appropriate here because we care about the sign of the correlation between the model-based and observed genome-wide predictability scores and are not claiming that our model explains a substantial part of the total variance.

We also wish to remind the reviewer that the analysis in question is part of a larger analysis examining the validity of our thermal limit model, which led to the development of the ‘flipped’ model. The ‘flipped’ model of seasonal allele frequency change has a substantially stronger signal when assessed using the various orthogonal tests that we present in the paper.

7) The "Flipped model" strikes me as troubling philosophically. The authors' report a negative relation, so to make it line up with observations from other populations they are flipping the season labels? The authors need to do quite a bit of work to justify this as anything other than p-hacking, and the authors in my opinion should remove analysis of the "flipped" set from the paper.

We respectfully disagree with the reviewer that we are engaged in p-hacking. Indeed, the flipped model arose naturally out of the leave-one-out analysis that we perform. In that analysis, we observed a prominent signal that several populations display strong allele frequency changes in the “opposite” direction from the rest of the paired samples. That result, we believe, is unambiguous.

We would like to note that we previously incorrectly stated that this leave-one-out analysis represents a validation of the overall results. Instead we use it, and now describe it as such, to test whether the signal comes from just a few populations or from the sample as a whole, and whether individual populations conform to that signal. We found that most do, but some do not and instead show an equally strong but reversed, “flipped” pattern.

We then further show that biologically plausible thermal limits are correlated with this flipping phenomenon. Of course, another plausible explanation is that the season labels were mistakenly swapped at some point in sample handling or library preparation. While possible, we believe this is unlikely, as there is no reason why such a mistake would take place in a way that relates to the thermal model.

Regardless of the cause of the flipping, it is clear that the flipped model offers a substantial improvement in the signal of seasonal evolution, even for tests that are orthogonal (e.g., clinal concordance and the ability to predict an independent dataset from the Bergland et al. 2014 PLoS Genetics paper). Note that we now discuss this latter possibility in the manuscript, where we write on P11, L408-411:

“Although we cannot rule out other explanations such as inadvertent sample swapping, our analysis is generally consistent with a model which suggests that changes in selection pressure within and among seasons may lead to dramatic changes in allele frequencies.”

Reviewer #2:

[…] Are the estimates of the number of SNPs and strength of selection consistent with the observed patterns and decay of Fst? That is, how well does Fst and the decay of Fst in your simulations match the observed data?

Given these same parameters it might be fun (but admittedly speculative) to estimate what proportion of sites in the genome are affected by "seasonal draft".

This is an interesting question and suggestion, but at the moment it is outside the scope of this study, as the simulations were conducted using unlinked loci in order to give us a sense of statistical power, the number of seasonal sites, and the range of fluctuations consistent with our data. More realistic models that include linkage are much more challenging to carry out, and the ability to fit the data to such models is limited by the nature of pooled datasets. We hope to carry out such an analysis in the future with data that include haplotype information.
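For concreteness, the sort of unlinked, single-locus simulation referred to above can be sketched as deterministic seasonal selection alternating in sign, plus binomial drift. All parameter values below are invented for illustration:

```python
# Hypothetical sketch of a single unlinked locus under seasonally
# alternating selection with genic fitnesses 1+s : 1, plus drift.
# Selection coefficients, generation counts, and N are invented.
import numpy as np

rng = np.random.default_rng(2)

def simulate_season(p0, s, n_gen, N):
    """Allele frequency after n_gen generations of selection s in pop size N."""
    p = p0
    for _ in range(n_gen):
        p = p * (1 + s) / (1 + s * p)          # deterministic selection
        p = rng.binomial(2 * N, p) / (2 * N)   # binomial sampling drift
    return p

p_spring = 0.5
# summer favors the allele, winter disfavors it
p_fall = simulate_season(p_spring, s=0.1, n_gen=10, N=10_000)
p_next_spring = simulate_season(p_fall, s=-0.1, n_gen=10, N=10_000)
print(p_spring, p_fall, p_next_spring)
```

Repeating such runs across many independent loci and selection strengths is what allows the range of per-site fluctuations and the statistical power to be gauged, without any linkage between sites.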

What does the estimate of the strength of selection and number of sites affected tell us about load? Are local Drosophila population sizes sufficient that this is plausible without leading to extinction?

We refer the reviewer to Wittmann et al. PNAS 2017, where we discuss how to interpret the plausibility of strong, multilocus seasonal evolution in the context of load. We also refer the reader to Bergland et al. PLoS Genetics 2014, where we discuss various plausibility arguments. Importantly, as we now discuss in the Results and Discussion, we do not claim that the detected sites are all causal; in fact, we believe that the majority must be cycling due to linked selection. The answer to the reviewer’s very good and appropriate question will need to wait until we and others arrive at a set of likely causal loci. Such fine mapping will require much larger datasets, and additional studies that complement sequencing, which we hope to collect in the future.

Of course the flipped model does provide more convincing evidence of some sort of seasonal effect, but I think I need a bit more convincing that the flipped model is justified other than the fact that it makes everything fit the preconceived model better.

The reviewer brings up an interesting point about the flipped model. We wish to highlight the following points:

1. The leave-one-out analysis which motivated the flipped model presents a striking feature of the data – that some populations have strong allele frequency fluctuations at many loci that go in the opposite direction from the remaining samples.

2. We provide evidence that the flipped model provides stronger signals of seasonal adaptation in analyses that are orthogonal to the flipping, i.e., the ValidationSet analysis, the clinal analysis, and enrichment with seasonal SNPs identified by Bergland et al. 2014.

3. The pre-conceived model, of course, is that spring is spring. The flipped model provides support for an alternative model in which local weather conditions provide a better predictor of adaptive dynamics over the growing season of flies than our limited definition of seasons.

How important is identification of individual causal loci (Line 653)?

We agree that the identification of the causal loci is not of core importance in this study. Ultimately, the identification of individual causal loci will be important because it will allow us to understand how strong selection on highly polygenic quantitative traits is realized at the genomic level. That being said, we do succeed in identifying some regions, and in some cases (such as for Tep 2,3) we have prior data suggestive of functional importance. We believe this analysis is important to include because it emphasizes that the selection is polygenic and that much additional data will be required to identify individual loci.

If the trait really is highly polygenic and there is limited concordance in loci among populations, how likely is this to be successful?

It is unclear how successful this endeavor will be, especially given the current limitations of association mapping in Drosophila. This is an excellent question but cannot be addressed in the current study.

This doesn't seem to me the natural conclusion from the work presented. It seems to me that careful quant gen studies of phenotypes, selection gradients, and how Vg or Va change among populations and over time might be of more interest.

We agree and thank the reviewer for the interesting suggestion. This work is currently outside the scope of the present study, but ultimately a full understanding of fluctuating selection will require such an integration.

I'm generally not a fan of GO analyses but recognize that many readers will ask for such tests; I thought the paragraph presenting these results was appropriately cautious.

Thank you! We are not fond of GO analyses for these types of data ourselves, and the reviewer is correct that many people are curious about the results of this test. Therefore, we have worded this sentence cautiously.

The definitions of spring and fall confused me somewhat. Are they based on per-location information on sampling abundance of Drosophila? It seems like there must be good weather station data for most of these locations; could a definition based on known thermal tolerances of Drosophila be used?

The reviewer makes an interesting point, and we would refer them to our discussion of this point in numerous passages throughout the manuscript. See our response to Reviewer 1’s first point.

What proportion of the ~1M SNPs filtered because they were not found across all populations show evidence of seasonal allele frequency change? It seems like population-specific variation is an interesting area that was left unexplored? Does this also explain some of the difference between Bergland 2014 and the present study?

This is an extremely interesting question, which unfortunately we do not have much power to explore. Identifying seasonal SNPs in single populations is challenging with only end-point data because of limitations in power. Note that we do not believe that the differences between this work and Bergland et al. 2014 can be explained by the use of private SNPs as the work by Bergland et al. 2014 relied on heavier filtering, with MAF across all populations (3 years of PA plus the clinal samples) > 0.15.

Reviewer #3:

This manuscript addresses a classical question with a unique data set. I was looking forward to reading it, but was disappointed by the statistical analysis, which I found both baffling and inappropriate.

I think the core of the problem is the lack of clearly and explicitly specified models and questions. Instead, you use a series of ad hoc tests that quickly made me lose faith in your conclusions.

It starts at the beginning (l. 186), where you perform a "test" of allele frequency change without specifying what you are testing, and why. After flipping between the 3-4 copies of the file I needed to have open to simultaneously look at text, figures, supplementary figures, and methods (which are not well written, see, e.g., the meaningless equation on l. 742), I think you are simply testing whether the frequencies at each SNP in each population differ significantly between spring and autumn under a very naive model, which you reject genome-wide (Figure 2A). But rather than tossing out this model and going back to the drawing board, you base all the remaining analyses on complicated intersections of p-values (which are inherently irrelevant in this paper, since you are not trying to identify individual loci).

We thank the reviewer for their comments and helpful suggestions for clarifying our manuscript. We have taken these comments to heart and have thoroughly cleaned up the analyses and simplified and streamlined their presentation. We have modified the Materials and methods to make them more comprehensive, and have also substantially re-written the Introduction, Results, and Discussion. We also moved the figures into the main body of the text to aid in the review process.

This seems very odd. The main question of the paper is whether allele frequencies change consistently across populations, and the natural way to test this is using some kind of permutation scheme using the allele frequency differences. You have naturally paired data, which probably can be treated as independent replicates (although this requires an argument about migration).

We agree and in fact this is exactly what we do. We now make this key point much more clearly and we thank the reviewer for pointing out that this key part of the analysis was obscure in the previous version.

Having established this, why not fit the whole data set into a generalized mixed linear model, without a priori assumptions about the distribution of the random changes in allele frequencies (which must reflect both estimation error and random drift), and with explicit terms for selection (as a function of local climate) and (possibly) migration.

We look forward to implementing this suggestion in future analyses that contain more paired spring-fall samples. Further, we believe that a non-parametric approach such as the rank p-value Fisher’s method we implement here, coupled with permutations to arrive at the null distribution, is likely to be the safest one to take. In our experience it is almost impossible to accurately model error in these datasets, and one generally ends up with inflated p-values. In any case, while this is an excellent suggestion, we feel that it is outside the scope of the present manuscript.

This would be so much easier to interpret than the ad hoc analyses you carry out, and would also avoid possible biases due to arbitrary p-value cut-offs for SNPs.

We would like to point out that we generally present results across a range of p-value thresholds and ask whether the signals of generally orthogonal tests get stronger as one goes to more and more stringent cutoffs. This is a general practice in such studies and we believe it is a good one.

Using such an approach, it should be possible to estimate what fraction of the genome changes in a non-random fashion. I am less sure about the strength of selection, as I think this only has meaning within an accurate demographic model (which you do not have).

Please note that our approach to estimating the fraction of the genome that is subject to seasonal fluctuations is not based on a subset of significant SNPs; rather, it is based on the genome-wide distribution of a semi-parametric test statistic. We end up with a ridge of plausible values and report it as such.

To sum up, I don't think you are doing your data justice here.

We agree that our original presentation was too convoluted to do the data justice. We hope that the revised manuscript allays these concerns.

[Editors’ note: what follows is the authors’ response to the second round of review.]

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that our decision remains the same as after the first round: you are welcome to resubmit, but dramatic changes are (still) necessary. We are sorry if we didn't manage to explain this the first time around.

Although we remain convinced that your study may contain an important and interesting result – strong selection related to season/climate on a wide geographic scale – your analyses largely obscure this. A much simpler (and shorter!) paper that focused on convincingly demonstrating this fact would be much better.

Other major problems include the "flipped" analysis, which is not only post-hoc, but serves to illustrate that you should use climate data directly since the seasons obviously aren't really seasons, and the selection simulations which are based on models that are too simplistic.

Finally, arguing that the "haphazard" sampling is a strength is, well, not convincing.

Our recommendation remains the same: given that the sampling is haphazard, simplify both claims and analysis in order to make the former rock solid. As you know, a recent pre-print by Buffalo and Coop argues that there is no signal in your first paper: here is your chance to prove that you were right after all.

As you will see from the revised manuscript we have taken your comments to heart and significantly streamlined the paper. It is much shorter and simpler now and as you suggested is fully focused on detecting the signal of seasonal adaptation and defining the environmental factors that correlate best with the detected adaptation.

Briefly, as suggested by the reviewers, we use several distinct statistical approaches to detect seasonal adaptation: (1) a GLM, (2) a method based on Fisher’s exact test, and (3) Bayenv. Furthermore, we use the paired structure of our data to create appropriate null distributions via permutations of the spring/fall labels while retaining all other structure of the data. We show that all three approaches indeed detect a significant signal of seasonal adaptation above what we expect from permutations.
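To illustrate the permutation scheme, here is a minimal sketch in which spring/fall labels are flipped per paired sample to build the null distribution of a simple consistency statistic; the allele frequencies, the injected seasonal shift, and the statistic itself are invented stand-ins for the GLM, RFM, and Bayenv statistics used in the paper:

```python
# Hypothetical sketch of the label-flipping permutation test.
# All data are simulated: 10 paired spring/fall samples, 500 SNPs,
# with a consistent seasonal shift injected at the first 50 SNPs.
import numpy as np

rng = np.random.default_rng(0)
n_pairs, n_snps = 10, 500

spring = rng.uniform(0.2, 0.8, (n_pairs, n_snps))
fall = spring + rng.normal(0, 0.02, (n_pairs, n_snps))
fall[:, :50] += 0.05  # consistent spring-to-fall change at 50 SNPs

def seasonal_stat(s, f):
    # average per-SNP consistency of the spring-to-fall change across pairs
    return np.abs((f - s).mean(axis=0)).mean()

observed = seasonal_stat(spring, fall)

# null distribution: flip spring/fall labels per pair, keep pairing intact
null = []
for _ in range(200):
    flip = rng.random(n_pairs) < 0.5
    s = np.where(flip[:, None], fall, spring)
    f = np.where(flip[:, None], spring, fall)
    null.append(seasonal_stat(s, f))

p = (np.sum(np.array(null) >= observed) + 1) / (len(null) + 1)
print(f"permutation p = {p:.3f}")
```

Because only the season labels are permuted, all other structure of the data (pairing, coverage, population identity) is preserved under the null, which is the key property of the scheme described above.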

We show that the detected seasonal SNPs are significantly concordant with clinal patterns even when we detect seasonal adaptation using a set of populations in Europe and California and testing for concordance with the clinal patterns along the East Coast of North America. This avoids the possibility of shared seasonal migration patterns affecting the signal.

We have performed additional analysis studying the genomic distribution of seasonal SNPs. In contrast to the previous version of this manuscript, we now present evidence that seasonal SNPs are significantly enriched on chromosome arms 2L and 3R, and are further enriched in regions associated with inversions on 2L, 3L, and 3R. We show that seasonal SNPs are significantly concordant with clinal patterns (the Fall variants tend to be more common in the North) and that this signal is found both inside and outside the inversions. The totality of the data supports a model of polygenic adaptation with a possibly more prominent role for the coadapted gene complexes associated with inversions. Although the nature of our dataset precludes an in-depth identification of individual causal sites and a delineation of the causal role of specific inversions in seasonal adaptation, we believe that these results strengthen the claim that seasonal adaptation is an important feature of Drosophila populations living in temperate environments.

We finish the paper by building an environmental model that attempts to explain varying patterns of concordance of individual populations with the overall pattern of seasonal fluctuations using a leave-one-out analysis. We show that the best fit model uses the number of days in the weeks prior to spring sampling with maximum temperature above ~32°C and the number of days in the fall with minimum temperature below ~5°C. These values roughly correspond to upper and lower thermal limits for D. melanogaster physiology and the length of a single generation.
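The two weather predictors in this model reduce to simple threshold counts over daily temperature records. A minimal sketch with hypothetical daily temperatures follows; the exact window lengths and fitted thresholds in the paper may differ.

```python
# Daily weather records for the weeks preceding each collection (toy values, °C).
spring_tmax = [32.5, 30.1, 33.0, 28.4, 34.2, 31.9, 29.8]  # daily maxima
fall_tmin   = [6.1, 4.2, 3.8, 7.0, 2.5, 5.5, 4.9]          # daily minima

def days_above(temps, threshold):
    """Count days whose temperature exceeds the threshold."""
    return sum(t > threshold for t in temps)

def days_below(temps, threshold):
    """Count days whose temperature falls under the threshold."""
    return sum(t < threshold for t in temps)

# Predictors in the best-fit model: hot days before the spring collection and
# cold days before the fall collection (thresholds near 32°C and 5°C).
spring_hot_days = days_above(spring_tmax, 32.0)
fall_cold_days = days_below(fall_tmin, 5.0)
```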

In order to streamline our message, we have removed any reference to the “flipped” model and have also removed our ABC analysis.

We believe the paper is much stronger, more succinct, and makes an important point about the speed of evolution and maintenance of genetic variation.

Reviewer #1:

This version is much better than the previous one, but I still find the analysis confused and confusing. I will comment further below, but before getting to this, I should state clearly that I know how difficult this is. We are currently struggling with analogous data from Arabidopsis, and it is clear that we lack an established framework for thinking about them. This is not textbook stuff.

A related, high-level comment is how ridiculously primitive we are from the point of view of basic ecology. We have no idea how selection is working, either in time or space, much less what the main selective pressures are. You write in your introduction that you "elected to solve these two related problems by (i) working as a consortium of Drosophila biologists (DrosRTEC or the Drosophila Real Time Evolution Consortium) and (ii) sampling in a somewhat haphazard manner." The first point is good, but the second is nonsense. The solution is surely not the kind of haphazard sampling described in this paper, but rather dense and non-random sampling in both time and space. (This is needed for Arabidopsis as well, btw)

We agree that it is a hard problem and instead of arguing the point about the value of haphazard sampling we removed this claim from the paper.

When I first read this, I didn't even realize that all your samples were not collected in a single year. This raises the issue of why you don't take this into account. In our multi-year Arabidopsis field experiments, we find that there is often greater variability between years than between sites located nearly 1,000 km from each other in a north-to-south direction (moving from deciduous forest to taiga).

The reviewer is correct that our dataset includes samples collected across multiple years. As we point out in the manuscript, we take the multi-year sampling into account in a number of ways:

1. In order to minimize bias caused by longer-term sampling of some localities, we use at most two years of data from any one sampling locality. The 20 paired spring-fall samples were collected at 15 sampling localities; therefore, 5 of the localities were sampled for multiple years.

2. Our GLM model includes a `population` term. In this model, we encode population as an unordered factor representing each of the 20 paired spring-fall samples. Thus, the GLM includes a “year” term, although it is embedded in a locality-by-year interaction (the population term).

3. The Bayenv model that we use would presumably capture any strong genetic differentiation between years in the genetic covariance matrix. Therefore, the signals of seasonal adaptation that we observe using the Bayenv model should be robust to year-to-year variation.

We agree that year-to-year variation in the genetic structuring of Drosophila is likely an important aspect of the evolutionary dynamics of this species. We believe that the present manuscript will provide a good starting point for future investigation of this topic.

First, what is the temporal correlation? The effect of space is clear from your PCA, but does time explain anything? How far do these flies move?

The PCA is dominated by the spatial signal, which is very robust and clearly persistent despite seasonal adaptation. We do not detect any PCs that correlate with season, and we state so in the paper: “No PC is associated with season, as defined by the time of collection, following correction for multiple testing, implying that the spatial differentiation dominates patterns of SNP variation across these samples.” (line ~155)

Second, you don't use climate data except to carry out your post hoc "flipping" of fall/spring labels. Did you try using climate data to explain what you see instead of a priori notions of spring and fall that you yourself dismiss as inadequate? We found that PCAs of multi-dimensional climate data were more strongly correlated with fitness than geographic proxies.

We no longer do any flipping, and we do indeed build an environmental model that explains the parallelism of individual populations with the rest of the populations surprisingly well. This is now a major component of the paper, and we have worked to clarify our approach in the manuscript.

We do indeed conclude that simple “spring” and “fall” labels are likely inadequate descriptors across a broad geographic range. However, we believe that it is a reasonable a priori starting point for analysis.

Another major unknown is the demographics of fruit flies. Your analyses assume (explicitly when you simulate selection) that each sampling location is a population (that dreaded population genetics abstraction that we all know doesn't really exist). You convincingly demonstrate that what you see cannot be due to migration of randomly differentiated flies (because there is some kind of concordance in changes globally), but this does not preclude that most of the allele frequency change is due to migration of selectively differentiated flies. Perhaps what happens is that there is selective die-off each winter, leading to cold-hardy survivors being overrepresented each spring – until they are swamped by southern migrants with vastly greater population sizes? Still selection, still adaptation, but with rather different predictions for the dynamics of allele frequencies. Talk about how fast populations adapt is less meaningful when we don't know what a population is. Returning to my initial point, to understand adaptation, we must understand the spatial and temporal scales over which allele frequencies change. You don't have the sampling density to do this.

We completely agree that understanding the spatial and temporal scales of allele frequency change is important and that the interpretation of those data are dependent on the nature of the “population”. Factors such as extinction, migration, population growth, and deme-size (to name a few) will likely have a major impact on allele frequency change through time. While our dataset permits us the opportunity to study some aspects of these meta-population features, we are clearly limited in addressing many of these nuanced features. Indeed, we are sample limited. We now discuss what is known/unknown about the meta-population dynamics of Drosophila in the Discussion section and draw attention to this important problem, including the need for increased sample density.

Glass half full: if you really see consistent changes despite all this, there must be some BIG signals out there…

We agree.

I am convinced that you do see such changes, but my main comment remains that I think you could learn more from these data if you took several steps back and thought carefully about alternative models, and the simplest possible ways to test them. I suggested using the paired structure (you still need to show that pairs can be treated as independent, btw), but I think you can take it further than you have in estimating effects at individual SNPs. But it is not my job as a reviewer to tell you how I would approach this, but rather to check whether I think your claims hold.

The goal of this manuscript is to provide evidence that seasonal adaptation is occurring in multiple populations across a portion of D. melanogaster’s range. We provide evidence of seasonal adaptation using three methods, one of which takes into account the genetic covariation between samples. We provide evidence that the seasonal SNPs that we observe are enriched with seasonal SNPs identified previously in Bergland et al. 2014, that patterns of seasonal adaptation are concordant across time and space, and that aspects of weather are correlated with the direction of allele frequency change. Taken together, we believe that our results go above and beyond the estimation of the effects of individual SNPs.

Point-by-point then, I am convinced by your permutations that there are large coordinated changes, but you still need to discuss the possible role of spatial and temporal correlations between pairs. I also think this point could be made better by going directly to estimates of allele frequency change, or to betas from regression, rather than involving p-values.

We now discuss the role of spatial and temporal correlations between pairs whenever we discuss the use of the Bayenv model.

We focus our analysis on the distribution of p-values, not betas, because p-values also include information on the precision of allele frequency change. In our experience, some sites with very large betas have noisy estimates of allele frequency (i.e., low read depth), or are at very low allele frequency. Selecting the sites with large betas (x-axis) would focus on a very different set of SNPs than focusing on sites with low p-values (y-axis). We agree that a study of the effect sizes of allele frequency change could be useful, however we would respectfully ask that this analysis wait for future analyses that incorporate more data and will allow for much greater resolution.

I find the whole "predictability" analysis convoluted and unconvincing. An attempt to shoehorn observations into an a priori framework. The leave-one-out analysis also requires some discussion of whether these points are independent, and I think it would not be necessary if you directly estimated consistent effects using permutation.

The “predictability” analysis that we applied grew out of a reasonable question that we asked as we acquired new seasonal samples: Are there predictable seasonal shifts in a new paired sample, compared to the previously acquired data? This work was generalized to the “leave-one-out” analysis for the data-set that we present in the manuscript. We agree that permutations are an important aspect of this analysis and we show that there are partially consistent effects associated with “season” using three different approaches (discussed above).

The comparison with clinal data is interesting, but again suffers from too strong priors of what is going on. You have climate data – why not simply check whether there are variables that explain both spatial and temporal shifts?

This is a completely reasonable question, and we suggest that this question is better answered in a separate study. Answering this question requires careful thought on what environmental factors to study, and how to meaningfully summarize environmental features recently felt by populations (e.g., weather as we discuss in our paper) and long-term environmental differences (e.g., climatic factors that consistently vary among populations).

As described above, while we cannot make strong statements about exactly how temporal and spatial dynamics interact, we do discuss multiple possibilities in the Discussion. Specifically, shared migration cannot generate the signal because we see concordance between seasonal and spatial patterns in very distantly located populations. Moreover, large-scale seasonal migration is generally very unlikely given our data, because populations retain their spatial identity on the spatial PCA year after year. (This is in contrast to the patterns seen in D. simulans, where we think that flies do not overwinter locally but re-migrate from more southern locations year after year and consequently lack much of the spatial structure (Machado et al. 2016).) More local patterns of extinction-recolonization, which must involve selection in order for the seasonal patterns to be detectable, are plausible, but we cannot make any rigorous claims about them. We do, however, discuss them.

I find your analysis of the strength of selection across the genome unconvincing for the reasons outlined above: you have absolutely no idea of the demographic model, and treating each population as a closed system (i.e., generating fall by sampling from spring) is simply not warranted.

We have removed the analysis of the strength and ubiquity of selection.

The analysis of which SNPs are under selection is not convincing. The complete lack of correspondence between the original and the "flipped" model strongly suggests that the peaks are not to be trusted. Here is where jack-knifing and bootstrapping might be useful: to assess the robustness of your p-value estimates. My guess is that while your data are strong enough to show that there is selection in aggregate, getting down to individual loci is not possible. This also militates against relying on high-significance filters rather than the whole distribution of effect sizes.

We removed this analysis and no longer attempt to detect individual loci under selection or to construct a flipped model. These questions will be better addressed with more dense sampling that we are carrying out in collaboration with the D. melanogaster research community.

Reviewer #2:

I have read the revised submission and the responses to review carefully. I still have major issues with the analysis done in this manuscript.

1) The authors still provide a textbook case of p-hacking through their "flipped" analysis. Their justification (that the leave-one-out analysis tipped them off to a flipped effect) is not a justification at all. This entire section of the paper needs to be removed.

We respectfully disagree that this is p-hacking. Nonetheless, we have removed this analysis from the paper. This makes the paper much more streamlined and avoids the question of how such post hoc analyses ought to be done correctly.

2) The binomial GLM that the authors are doing is inappropriate, as I pointed out in the initial submission. The "paired fashion" of sampling helps nothing: the authors are simply testing against the null of there being no difference in frequency between seasons, assuming a binomial error model. The binomial error model here is inappropriate because spring and fall are not independent draws from the same parameterized distribution; allele frequency change is expected to occur due to drift. The authors are not accounting for this, and it is a major flaw. I would suggest at the very least that the authors use a Nicholson et al.-type framework to properly account for allele frequency change between seasons.

We agree with the reviewer that the GLM is not the perfect model for analyzing pooled allele frequency change data. As we show in the supplement, for instance, the nominal p-values of the GLM are likely to be inflated. However, because we construct the null distribution using permutations that flip the Spring and Fall labels for each population, we believe that our analysis is appropriate against this null. In the revised manuscript we use two alternative approaches, a Bayesian Bayenv method and the RFM method based on Fisher’s exact test, both of which detect a similar strength of seasonal signal against the permutation null. We are thus fully confident that our analysis is robust.

3) The justification that the authors give about "haphazard sampling" is risible. Adding noise does not add "inferential power", as the authors claim on line 174.

We removed this argument from the manuscript.

4) The issue of biased ascertainment of SNPs has not been dealt with. The authors simply give their same estimates of genome-wide numbers of SNPs affected by seasonal selection and then follow with a caveat. All such genome-wide estimates should be removed; you can't estimate them given your ascertainment conditions. Moreover, you say on line 235, "Whether this SNP selection process generates bias in our estimates […] remains to be determined." This is unacceptable: your ascertainment definitely creates bias, and this language brushes that under the rug.

We have removed this analysis from the manuscript.

5) The code for the permutation routines and the control SNP matching needs to be shared.

Code for all analyses can be found at https://github.com/machadoheather/dmel_seasonal_RTEC.

The code to run the permutations can be found below.

GLM:

https://github.com/machadoheather/dmel_seasonal_RTEC/seasonal_glm/

RFM:

https://github.com/machadoheather/dmel_seasonal_RTEC/seasonal_rfm/

Bayenv:

https://github.com/machadoheather/dmel_seasonal_RTEC/seasonal_baye

The code for the control SNP matching for the latitudinal and seasonal concordance can be found here:

https://github.com/machadoheather/dmel_seasonal_RTEC/create_input_fil

6) Lines 293-295: The authors report that permutation suggests the p-value of enrichment > 0.1. This suggests chance and nothing else is responsible for the observed effect, despite the authors' conclusion of "robust evidence that parallel seasonal adaptation is a general feature…"

We have changed this section entirely. We now report three different statistical methods and detect seasonal adaptation robustly.

7) Line 309: no enrichment test has been described.

This section has been deleted.

8) Line 330: Citing Gould here is silly. Pick a more appropriate citation.

This section has been deleted.

9) Supplemental Figure 7: in the Bayenv analysis that I asked for, the "All" curve looks very different and goes against the conclusion that the authors are making. No explanation is given.

The differences are present only at low p-values where there are few SNPs. We have updated the figure with error bars to reflect this.

10) The ABC analysis, now described, does not use proper population genetic simulations of allele frequency change per generation due to drift + selection. As written, there is only one generation of drift. This needs to be changed for this analysis to be taken even moderately seriously. Moreover, the code for these simulations needs to be shared.

We have removed this analysis from the manuscript.

11) Line 1020: In the concordance regression the authors are doing a binned regression; this is never appropriate, and the authors need to redo this analysis without binning.

We agree that the binned regression would not be appropriate if we were solely interested in the statistical significance of the slope. However, we use the slope as a summary statistic to quantify the rate of change of concordance across the bulk of the genome.
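The use of the binned-regression slope purely as a summary statistic can be illustrated with a small sketch; all bin values here are hypothetical, not taken from the paper.

```python
import numpy as np

# Hypothetical binned data: each x is the midpoint of a seasonal-significance
# quantile bin; each y is the fraction of SNPs in that bin whose seasonal and
# clinal allele frequency changes agree (concordance).
quantile = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
concordance = np.array([0.50, 0.51, 0.53, 0.55, 0.58])

# Least-squares slope of the binned regression, used only as a summary
# statistic for how quickly concordance rises with seasonal significance,
# not as the basis for a p-value.
slope, intercept = np.polyfit(quantile, concordance, 1)
```

The statistical significance of such a slope would indeed be inflated by binning; the point above is that the slope is compared across datasets (or against matched controls), not tested directly.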

Reviewer #3:

Overall this is an improved manuscript. Easier to read and follow, and better explained. There are several points I think that should still be addressed.

I am still not a fan of the flipped model. I agree that some of the evidence (predicting into the validation set, etc.) does indeed argue it's a better fit, but it still feels like ad hoc subjective tweaking of the data until it fits well. I would prefer it to be removed from the paper; I think you should show the original model and point out that some populations show the reverse pattern and that this matches with temperature. Perhaps even include the flipped model in the supplement. I would find that more convincing than the flipped model, I think. In either case, the paragraph starting on line 445 should go, as even the authors admit this doesn't really show anything meaningful. The flipped model should also be removed from Figures 2A and 2B, as again it will by definition show a more convincing signal there.

We agree and have removed the flipped model. We hope you will find the analysis crisper and easier to follow.

As an alternative to the flipped model or presenting the data with the coarse labels of spring and fall, why not actually model the temperature data available? It would seem an objective a priori approach that should allow for differences in the flipped populations (i.e., presumably the difference in temperature in the 3 weeks prior between fall and spring behaves differently for the flipped populations). Perhaps use mean temperature in the 3 weeks prior and/or the slope of the change of temperature over that time? I'm sure there are more creative/intelligent options, but I don't quite understand why the authors can't use these data instead of grossly categorizing things as spring or fall. I didn't see a good reason for not doing so in the response.

We approached the analysis of this dataset with a clear hypothesis that flies collected in spring (or fall), across a portion of the species range spanning two continents, would show allele frequency shifts at a common set of loci. While we find broad support for this hypothesis, we also find that it is not the whole story and provide evidence that aspects of weather in the weeks prior to sampling are associated with the direction of allele frequency change at a subset of loci. We agree with the reviewer that there are a number of different ways of characterizing environmental change across space and time; reviewer 1 brought up a similar point. Indeed, there are an infinite number of ways of characterizing the environment. We explored a simple characterization (spring and fall), and extended this analysis to include some aspects of temperature because i) we have data about it, ii) it tends to be correlated with many other aspects of the environment, and iii) we have prior knowledge that temperature fluctuations of the kinds we consider are physiologically and developmentally relevant to D. melanogaster. We look forward to investigating the question of the environmental drivers of seasonal adaptation in future work that uses much denser temporal and spatial sampling of flies. Importantly, we investigated a range of models and showed that only one set of models can explain the data. In addition, none of the nuisance variables (such as contamination with D. simulans reads, collection method, or geographic location) are correlated with the concordance statistic.

I find the authors' treatment of enrichment odd. In some places it is presented as convincing evidence, and in others (line 563) it is disregarded because of absolute numbers. The logic on line 563 is fine, of course, but I would like to see enrichment treated the same way throughout. On line 477 it is convincing as a log odds score, and in the paragraph starting on line 318 a modest percentage enrichment is considered good evidence.

We have streamlined the enrichment analyses and now present enrichment results solely for overlap with previously identified seasonal SNPs: “Top 1% of SNPs identified using the GLM overlap with SNPs identified by Bergland et al. (2014) slightly more than expected relative to matched genomic controls after re-identifying seasonal SNPs in the Core20 set excluding the overlapping Pennsylvanian populations (log2 odds ratio ± SD = 0.59 ± 0.37, p_perm = 0.0512).”

I'd like to see a bit more exploration of the clustering. Figure 5D (visually) and the 100 kb window analysis seem to suggest that clustering is on a relatively large scale, yet the analysis presented on line 629 for % genome and s only investigates 5 kb windows. If I'm understanding the ABC correctly, it should be pretty fast to run, and it seems like running it at the 50, 100, or even 500 kb scale might be of interest. (To my eye, some of the figures in S10 start to suggest a flattening of the ridge when done at the 5 kb scale.) Certainly the data do appear to argue for a polygenic architecture, but whether this is ~50 windows or 5% of the genome I think isn't well differentiated.

We have removed this analysis from the manuscript.

Line 800: I agree with this logic about temperature and why some populations behave differently. I would have liked to see this prediction about temperature earlier in the introduction. Naively my first impression was that Fall populations would be adapted to cooler conditions and Spring to warmer. I see now why that is wrong, but I think stating up front that Fall populations are expected to reflect adaptation to warm summers would help some readers.

We now explain in the introduction that flies collected in spring are the descendants of flies who survived winter, and that flies collected in fall are the descendants of those lineages that prospered during the summer.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Almost there! As you will see from the comments below, we got a fresh 3rd reviewer, who picked up on something we agree should be addressed, namely the likely role of inversions. We have two suggestions for how to proceed. Either: a) go through the manuscript and make sure you emphasize that much of the signal may be driven by inversions, and that it is impossible to know how polygenic this really is, or; b) provide additional analyses (e.g., of the X chromosome), to demonstrate that there is a signal independent of inversions.

We have edited the text to emphasize that the signal may be driven largely by inversions and that the extent of polygenicity is difficult to ascertain with the current dataset.

Reviewer #3:

The strongest seasonal signal comes from inversions. If inversions are responding to seasonal selection, it is not surprising that the authors find parallel SNP changes across populations, as the same inversions are shared globally. Unless the authors refocus the manuscript on parallel selection on inversions, their analyses need to be modified: almost all analyses use the full SNP set, but to study real parallel selection responses at the SNP level, the authors need to restrict their analysis to SNPs that are not affected by inversions. To this end, it is important to keep in mind that inversions may suppress recombination also outside of the inversion, which makes it a bit challenging to determine the autosomal fraction that is not affected by inversions. A much better strategy would be to analyze the X chromosome, which is the only major chromosome free of inversions. Unfortunately, the authors excluded this chromosome from their analyses.

We appreciate the reviewer’s suggestion about analyzing the X-chromosome. We excluded the X-chromosome in our analysis because we sequenced males, and thus have reduced power to detect seasonal allele frequency shifts. The reduction of power on the X compared to the autosomes would complicate any direct comparison. Therefore, we have taken your suggestion (and that of the editors) and refocused the manuscript on parallel selection on inversions.

Line 108: removed the word “polygenic”.

Lines 207-210: added statistics on the seasonal enrichment outside inversions.

Line 349: Removed the claim: “While we cannot determine the precise number and identity of causal sites in the present study, the totality of the evidence is consistent with the polygenic nature of adaptation.”

Lines 352-355: comment on the power to detect parallel seasonal evolution in low recombination regions, such as inversions.

Lines 359: changed “strongly polygenic” to “complicated architecture”.

Line 412: added mention of selection on inversions obscuring the estimate of number of seasonal loci.

Lines 420-422: added “While given our dataset it is impossible to know the true extent of polygenicity, our results suggest that seasonal adaptation acts on at least several but potentially a large number of dispersed loci affecting linked variation genome-wide.”

Line 426. changed the claim of seasonal adaptation being polygenic to a claim of seasonal adaptation at common SNPs.

Title: changed to “Broad geographic sampling reveals the shared basis and environmental correlates of seasonal adaptation in Drosophila

Anyway, inspection of Figure 2D shows that the signal for seasonal SNPs is erased for regions outside of the inversions. Furthermore, a significant concordance pattern between seasonal and clinal SNPs outside of the inversion is restricted to 2L and 3R, the chromosomes with the strongest inversion effects. This could be interpreted as an effect of inversions on the genomic regions flanking the inversion.

The reviewer is correct that seasonal SNPs do not appear to be enriched outside of regions surrounding inversion breakpoints or inside of inversions on each of the autosomal arms (see added statistics, lines 207-210). As we pointed out above, we have refocused our manuscript to highlight the likely role of inversions associated with seasonal adaptation in D. melanogaster. However, we caution that inversions are not solely responsible for seasonal adaptation and speculate that the true mechanistic dynamics are likely more complicated.

How do the authors interpret a (presumably significant) underrepresentation of concordance SNPs on 3L?

The under-enrichment in the concordance of seasonal and clinal polymorphisms outside of the inversion on 3L is not statistically significant after multiple-testing correction (FDR = 0.69). However, for the sake of speculation, it is possible that counter-gradient evolution occurs at certain SNPs, wherein alleles that are beneficial over the winter are also at higher frequency in the south compared to the north. Future studies with larger sample sizes will be necessary to assess whether there truly is a consistent inverted pattern in this genomic region.

Apart from my doubts about the significance of the seasonal selection signal, I would like to come back to the novel aspect of the manuscript: sharing of seasonal SNPs across populations. The authors highlight, probably correctly, that seasonal adaptation is polygenic. This raises the question of whether parallel selection signatures are expected in differentiated populations. In my opinion, two lines of reasoning speak against it: 1) probably more variants are segregating in the populations than are required for seasonal adaptation (redundancy); 2) the frequencies of the seasonal SNPs most likely differ between the populations. Hence, SNPs closer to 50% are expected to respond more to the same selection pressure than SNPs with more extreme allele frequencies. This will lead to different power to detect the same selection response in differentiated populations.

These are completely reasonable points. In the discussion, we now highlight how redundancy could affect the seasonal signal that we observe. We believe that point #2 is reasonable but addressing this question rigorously is outside of the scope of the present manuscript.

Analyze the X-chromosome.

Please see the comments above.

Remove the second season from the locations where two spring-fall pairs were included-only this makes the comparison unbiased.

While we appreciate the concern here, we think that it is unfounded. As shown in Bergland et al. 2014, population differentiation between years is as strong, on average, as that between populations separated by 5° latitude. Although each locality in our sample seems to represent a stable population (i.e., no mass extirpation), it is not clear that having two years of samples from a single locality is any more biased than having seasonal samples from neighboring localities (e.g., Linvilla PA and State College PA). In addition, we show that there is a significant excess of seasonal sites using the Bayenv model, which accounts for genetic covariance among samples.

Evaluate whether the spring-fall permutations remove the statistical issues of the GLM and Fisher's exact tests mentioned by the previous reviewers. Clarify that the matched controls were done on a sample basis, rather than across samples.

The 100 matched control datasets were generated with the characteristics of each SNP – not by using any single-population characteristics. One potential reason this could have been misinterpreted is that one of the matching characteristics is median spring allele frequency; this being the median across populations. We have clarified this in the methods (lines 585-586).
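To illustrate the per-SNP (rather than per-sample) matching, here is a hypothetical sketch; the field names, matching criteria beyond median spring frequency, and tolerance are assumptions for illustration, not the authors' implementation (the real code is linked in the response to reviewer 2, point 5).

```python
def matched_controls(target, candidates, n_sets=100, freq_tol=0.02):
    """Return up to n_sets control SNPs matching the target SNP's chromosome,
    inversion status, and median spring allele frequency (the median being
    taken across populations, not per sample)."""
    matches = [c for c in candidates
               if c["chrom"] == target["chrom"]
               and c["inv"] == target["inv"]
               and abs(c["med_spring_freq"] - target["med_spring_freq"]) <= freq_tol]
    return matches[:n_sets]

# Hypothetical SNP records.
target = {"chrom": "2L", "inv": True, "med_spring_freq": 0.35}
candidates = [
    {"chrom": "2L", "inv": True,  "med_spring_freq": 0.34},
    {"chrom": "2L", "inv": False, "med_spring_freq": 0.35},  # wrong inversion status
    {"chrom": "3R", "inv": True,  "med_spring_freq": 0.35},  # wrong chromosome
    {"chrom": "2L", "inv": True,  "med_spring_freq": 0.52},  # frequency too far
    {"chrom": "2L", "inv": True,  "med_spring_freq": 0.36},
]
controls = matched_controls(target, candidates)
```

Matching on SNP-level characteristics in this way means each control set mimics the genomic properties of the focal SNP set without using any single population's data.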

Clarify that the effective coverage was calculated per SNP.

We have clarified this in the methods (line 531).
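For readers unfamiliar with pool-seq, one common per-SNP formulation of effective coverage combines the read depth at that site with the number of chromosomes in the pool. The formula below is a standard approximation from the pool-seq literature, not necessarily the exact expression used in the paper's methods:

```python
# Hedged sketch: in pool-seq, "effective coverage" is often computed per
# SNP as the effective number of independently sampled chromosomes,
# combining the read depth R at that site with the 2N chromosomes in
# the pool: n_eff = R * 2N / (R + 2N). This is a common approximation,
# not necessarily the exact expression in the paper's methods.
def effective_coverage(read_depth, n_chromosomes):
    """Effective number of independently sampled chromosomes at one SNP."""
    return read_depth * n_chromosomes / (read_depth + n_chromosomes)

pool_chroms = 100  # e.g. a pool of 50 diploid flies (illustrative assumption)
for depth in (20, 100, 1000):
    print(f"depth {depth:4d} -> effective coverage "
          f"{effective_coverage(depth, pool_chroms):.1f}")
```

Because read depth varies from site to site, the effective coverage is necessarily a per-SNP quantity, which is the clarification made in the methods.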

The authors cite theoretical work which suggests that seasonal SNPs may be maintained only under highly restricted conditions (changing dominance). Do they find empirical support that these conditions are met in their data?

We are unable to test for changes in dominance coefficients between seasons using the data that we present in this manuscript.

The significance of the manuscript to a broader audience could be increased by:

– A statement that the seasonal selection response is restricted to inversions, but I doubt that this is the message the authors would like to portray.

Please see the comments above.

– A general discussion about the expectations of parallel selection signatures on the SNP level across populations and why the authors expect to see it (or find it against the expectations).

Please see the comments above.

https://doi.org/10.7554/eLife.67577.sa2

Article and author information

Author details

  1. Heather E Machado

    1. Department of Biology, Stanford University, Stanford, United States
    2. Wellcome Sanger Institute, Hinxton, United Kingdom
    Contribution
    Conceptualization, Data curation, Formal analysis, Investigation, Visualization, Methodology, Writing - original draft, Writing - review and editing
    Contributed equally with
    Alan O Bergland
    For correspondence
    heather.machado@sanger.ac.uk
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-1523-3937
  2. Alan O Bergland

    1. Department of Biology, Stanford University, Stanford, United States
    2. Department of Biology, University of Virginia, Charlottesville, United States
    Contribution
    Conceptualization, Resources, Formal analysis, Funding acquisition, Investigation, Visualization, Methodology, Writing - original draft, Project administration, Writing - review and editing
    Contributed equally with
    Heather E Machado
    For correspondence
    aob2x@virginia.edu
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0001-7145-7575
  3. Ryan Taylor

    Department of Biology, Stanford University, Stanford, United States
    Contribution
    Data curation, Formal analysis
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-9003-6378
  4. Susanne Tilk

    Department of Biology, Stanford University, Stanford, United States
    Contribution
    Investigation
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-9156-9360
  5. Emily Behrman

    Department of Biology, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-2472-9635
  6. Kelly Dyer

    Department of Genetics, University of Georgia, Athens, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  7. Daniel K Fabian

    1. Institute of Population Genetics, Vetmeduni Vienna, Vienna, Austria
    2. Centre for Pathogen Evolution, Department of Zoology, University of Cambridge, Cambridge, United Kingdom
    Contribution
    Resources
    Competing interests
    No competing interests declared
  8. Thomas Flatt

    1. Institute of Population Genetics, Vetmeduni Vienna, Vienna, Austria
    2. Department of Biology, University of Fribourg, Fribourg, Switzerland
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-5990-1503
  9. Josefa González

    Institute of Evolutionary Biology, CSIC- Universitat Pompeu Fabra, Barcelona, Spain
    Contribution
    Resources
    Competing interests
    No competing interests declared
  10. Talia L Karasov

    Department of Biology, University of Utah, Salt Lake City, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  11. Bernard Kim

    Department of Biology, Stanford University, Stanford, United States
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-5025-1292
  12. Iryna Kozeretska

    1. Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
    2. National Antarctic Scientific Centre of Ukraine, Taras Shevchenko Blvd., Kyiv, Ukraine
    Contribution
    Resources
    Competing interests
    No competing interests declared
  13. Brian P Lazzaro

    Department of Entomology, Cornell University, Ithaca, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  14. Thomas JS Merritt

    Department of Chemistry & Biochemistry, Laurentian University, Sudbury, Canada
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-4795-7534
  15. John E Pool

    Laboratory of Genetics, University of Wisconsin-Madison, Madison, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  16. Katherine O'Brien

    Department of Biology, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0003-4660-0338
  17. Subhash Rajpurohit

    Department of Biology, University of Pennsylvania, Philadelphia, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  18. Paula R Roy

    Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  19. Stephen W Schaeffer

    Department of Biology, The Pennsylvania State University, University Park, United States
    Contribution
    Resources
    Competing interests
    No competing interests declared
  20. Svitlana Serga

    1. Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
    2. National Antarctic Scientific Centre of Ukraine, Taras Shevchenko Blvd., Kyiv, Ukraine
    Contribution
    Resources
    Competing interests
    No competing interests declared
  21. Paul Schmidt

    Department of Biology, University of Pennsylvania, Philadelphia, United States
    Contribution
    Conceptualization, Formal analysis, Investigation, Resources, Funding acquisition, Writing - original draft, Project administration, Writing - review and editing
    Contributed equally with
    Dmitri A Petrov
    For correspondence
    schmidtp@upenn.edu
    Competing interests
    No competing interests declared
  22. Dmitri A Petrov

    Department of Biology, Stanford University, Stanford, United States
    Contribution
    Conceptualization, Supervision, Formal analysis, Investigation, Funding acquisition, Writing - original draft, Project administration, Writing - review and editing
    Contributed equally with
    Paul Schmidt
    For correspondence
    dpetrov@stanford.edu
    Competing interests
    No competing interests declared
    ORCID iD: 0000-0002-3664-9130

Funding

NIH Office of the Director (R01GM100366)

  • Paul Schmidt
  • Dmitri A Petrov

NIH Office of the Director (R35GM118165)

  • Dmitri A Petrov

NIH Office of the Director (R01GM137430)

  • Paul Schmidt

NIH Office of the Director (F32GM097837, R35GM119686)

  • Alan O Bergland

European Commission (H2020-ERC-2014-CoG-647900)

  • Josefa González

Natural Sciences and Engineering Research Council of Canada (RGPIN-2018-05551)

  • Thomas Merritt

Canada Research Chairs (950-230113)

  • Thomas Merritt

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank the National Evolutionary Synthesis Center (NESCent) for sponsoring the 2012 Catalysis meeting that initiated the Drosophila Real Time Evolution Consortium. The meeting was attended by Alan Bergland, Alisa Sedghifar, Brian Helmuth, Brian Lazzaro, Chau-Ti Ting, David Kidd, Dmitri Petrov, Fabian Staubach, Hannah Burrack, Jim Fry, John Lessard, John Coulbourne, John Pool, Josefa Gonzalez, Julien Ayroles, Kelly Dyer, Kim Hughes, Maaria Kankare, Nadia Singh, Paul Schmidt, Regan Early, Stephen Porder, Subhash Rajpurohit, Sui Fai Lee, and Thomas Flatt; we thank all participants. We also thank Jamie Blundell and all members of the Schmidt and Petrov labs for their exceptionally valuable feedback.

Senior Editor

  1. Patricia J Wittkopp, University of Michigan, United States

Reviewing Editor

  1. Magnus Nordborg, Austrian Academy of Sciences, Austria

Reviewer

  1. Magnus Nordborg, Austrian Academy of Sciences, Austria

Publication history

  1. Received: February 16, 2021
  2. Accepted: June 21, 2021
  3. Accepted Manuscript published: June 22, 2021 (version 1)
  4. Version of Record published: July 1, 2021 (version 2)

Copyright

© 2021, Machado et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
