An oligogenic architecture underlying ecological and reproductive divergence in sympatric populations
Abstract
The evolutionary trajectories and genetic architectures underlying ecological divergence with gene flow are poorly understood. Sympatric timing types of the intertidal insect Clunio marinus (Diptera) from Roscoff (France) differ in lunar reproductive timing. One type reproduces at full moon, the other at new moon, controlled by a circalunar clock of yet unknown molecular nature. Lunar reproductive timing is a magic trait for a sympatric speciation process, as it is both ecologically relevant and entails assortative mating. Here, we show that the difference in reproductive timing is controlled by at least four quantitative trait loci (QTL) on three different chromosomes. They are partly associated with complex inversions, but differentiation of the inversion haplotypes cannot explain the different phenotypes. The most differentiated locus in the entire genome, with QTL support, is the period locus, implying that this gene could not only be involved in circadian timing but also in lunar timing. Our data indicate that magic traits can be based on an oligogenic architecture and can be maintained by selection on several unlinked loci.
Editor's evaluation
This important article provides solid evidence for an apparent oligogenic architecture of an ecologically relevant trait, the circalunar reproduction of marine midges, which contributes to assortative mating, is likely under divergent selection, and supports reproductive isolation in sympatry.
https://doi.org/10.7554/eLife.82825.sa0Introduction
Evolutionary biology has seen a long debate on whether speciation can only happen with geographic isolation (allopatry) or also with full range overlap (sympatry) Mayr, 1947; Smith, 1966; Via, 2001; Foote, 2018; Coyne and Orr, 2004. Today, allopatric and sympatric speciation are viewed as the two extremes on a multifaceted continuum of speciation with gene flow, for which there is growing evidence Foote, 2018; Richards et al., 2019. Speciation with gene flow is generally assumed to start with divergent ecological selection, but for speciation to complete, additional components of reproductive isolation must come into play Smadja and Butlin, 2011; Seehausen et al., 2014. Traits underlying ecological divergence and other components of reproductive isolation must be genetically coupled, as otherwise they cannot withstand the homogenizing effects of gene flow and recombination Smadja and Butlin, 2011; Felsenstein, 1981. Such coupling can be achieved via pleiotropy, magic traits, or genetic linkage between respective genes. Pleiotropy describes the situation where ‘one allele affects two or more traits’ Barton et al., 2007 contributing to reproductive isolation. Here, the coupling can be achieved via a single genetic locus. A ‘magic trait’ Gavrilets, 2004 in turn is a single trait which affects several components of reproductive isolation at the same time Smadja and Butlin, 2011; Servedio et al., 2011. It is therefore also referred to as a ‘multiple-effect trait’ Smadja and Butlin, 2011. The concept does per se not include an assumption on the genetic basis of the trait, and should therefore not be equated with pleiotropy. Classically, magic traits are thought to affect ecological divergence and assortative mating at the same time. Of particular interest to our study is the scenario where divergent ecological adaptations lead to differences in reproductive timing, that is allochrony Taylor and Friesen, 2017 or isolation by time Hendry and Day, 2005. One such example is found in the apple maggot Rhagoletis Filchak et al., 2000; Doellman et al., 2018. Finally, genetic linkage imposed by genomic regions of low or suppressed recombination, such as chromosomal inversions, can entail a coupling of ecological and reproductive traits even if they are controlled by different sets of genes Butlin, 2005. As an example, a chromosomal inversion in the monkey flower Mimulus guttatus was found to contribute to ecological adaptation and reproductive isolation Lowry and Willis, 2010. Other examples of inversions associated with ecologically relevant traits are found in Littorina snails Koch et al., 2021; Faria et al., 2019 and Heliconius butterflies Joron et al., 2011. In this study, we assessed if magic traits or genetic linkage play a role in sympatric population divergence in the marine midge Clunio marinus (Diptera: Chironomidae).
C. marinus is found in the intertidal zone of the European Atlantic coast. Larvae live at the lower levels of the intertidal, which are almost permanently submerged. For successful reproduction, the adults need these regions to be exposed by the low tide. The lowest tides predictably recur just after new moon and full moon. Therefore, C. marinus adults emerge only during full moon or new moon low tides, reproduce immediately and die a few hours later in the rising tide. This life cycle adaptation is based on a circalunar clock, that is an endogenous time-keeping mechanism which synchronizes development and maturation with lunar phase. Circalunar clocks are common in marine organisms, but their molecular basis is still unknown Kaiser and Neumann, 2021. Additionally, a circadian clock ensures that C. marinus adults only emerge during the low tide Neumann, 1967. As the pattern and amplitude of the tides differ dramatically along the coastline, C. marinus populations from different geographic sites show various genetic adaptations in circadian and circalunar timing Neumann, 1967; Kaiser, 2014; Kaiser et al., 2011.
Notably, low tide water levels follow a bimodal distribution across a lunar month, with minima at both full moon and new moon. These minima are ecologically equally suitable for C. marinus’ reproduction, representing different timing niches that are occupied by different timing types (for details and definitions see Kaiser et al., 2021). Some timing types of C. marinus use both niches (‘semi-lunar rhythm’; SL type), but there are also dedicated full moon (FM) or new moon (NM) timing types, which occupy only one timing niche Kaiser et al., 2021. As water levels outside the minima are much less suitable for successful reproduction, we can expect divergent selection toward either full moon or new moon emergence. This divergence into FM and NM timing types represents a magic trait, as it automatically entails assortative mating. Additionally, hybrids between timing types are expected to emerge at intermediate times with respect to their parents Neumann, 1967; Kaiser et al., 2011, that is during neap tide high tides. As this tidal situation is unfavorable for C. marinus’ reproduction, we can expect selection against hybrids, a form of an ecologically imposed postzygotic incompatibility. We recently discovered that in Roscoff (France) FM and NM timing types occur in sympatry, largely separated by reproductive timing, but still connected by gene flow Kaiser et al., 2021. Here, we present evidence for polymorphic chromosomal inversions in all chromosome arms of the sympatric FM and NM types. Quantitative trait loci (QTL) mapping for divergence in lunar reproductive timing identifies four unlinked and additive QTL in different chromosomes and inversions. Individual loci inside the inversions are more differentiated than the inversions, suggesting that the inversions themselves are not directly required to maintain linkage disequilibrium between adaptive loci and that ecological divergence is kept up by permanent selection on unlinked loci on different chromosomes.
Results
Genome sequencing confirms limited genetic differentiation between the sympatric FM and NM types
In order to gain insights into the loci and processes underlying sympatric divergence in reproductive timing, we sequenced 48 individual genomes for the FM and NM timing types (24 individuals each, defined by their lunar timing phenotype; genome size: 79.4 Mbp; average read coverage: 20 x). The resulting set of 721,000 genetic variants (608,599 SNPs; 112,401 small indels) indicated limited genetic differentiation between the FM and NM types (Figure 1; weighted genome-wide FST = 0.028). Principal component analysis (PCA) and ADMIXTURE identify one individual as a migrant in time, which was caught at full moon among FM type individuals, but genetically clearly is of NM type (Figure 1A and B; blue circle and arrow). Many other individuals, particularly in the NM type, show ADMIXTURE fractions close to 0.5 or 0.25, suggesting they are F1 hybrids or backcrosses (Figure 1A and B; yellow circle and arrows). Finally, there are four FM individuals, which are genetically distinct along principal component 2 and in ADMIXTURE at K=4 (Figure 1A and B; red circle and arrows). These might constitute either a sub-lineage of the FM type or migrants from a different geographic location. Taken together, full genome resequencing confirms that there is considerable hybridization of sympatric timing types and overall low genome-wide differentiation.

Population structure of the FM and NM types in Roscoff based on 721,000 genetic variants.
(A, B) Principal component analysis (PCA; A) and admixture analysis (B) identify one migrant in time (blue; pure NM genotype caught at full moon), many potential F1 hybrids (yellow) and four individuals in the FM strain that appear genetically distict from all other samples (red). (C) Global genetic differentiation is limited, but there is a block of strong differentiation on chromosome 1.
A differentiated chromosomal inversion system on chromosome 1
Plotting genetic differentiation of the phenotypically defined FM and NM types along the genome (Figure 1C) revealed a region of elevated FST on the telocentric chromosome 1 (Figure 2A). This region coincides with a block of long-range linkage-disequilibrium (LD) in the FM type (Figure 2B; Figure 2—figure supplement 1). Structural variant (SV) calling based on additional long-read sequencing data supports that this block represents a chromosomal inversion polymorphic in both timing types, hereby called In(1a) (Figure 2C; full dataset in Supplementary file 1). Notably, the NM type shows a smaller LD block (Figure 2B; Figure 2—figure supplement 1), suggesting that a second structural variant occurs within the limits of the detected large chromosomal inversion. While this second variant was not picked up in SV calling, genetic linkage information obtained from crosses (described in detail below) indicates that this is a second inversion, named In(1b) (see double inverted marker order in Figure 5A and B).

A complex inversion system on chromosome 1.
(A) Chromosome 1 harbors a block of markedly elevated genetic differentiation. (B) This genomic block coincides with two windows of elevated long-range LD in the FM and NM strains. (C) Structural variant (SV) calling from long read sequencing data supports that the larger LD block is due to an inversion. (D–G) Principal component analysis (PCA) for chromosomal sub-windows corresponding to suggested inversions separates the individuals into clusters corresponding to standard haplotype homozygotes (SS), inversion homozygotes (I1I1, I2I2) and inversion heterozygotes (SI1, SI2, SI3, I1I2). The four individuals which were already found distinct in whole genome analysis (red question mark) cannot be assessed. (H–K) Observed heterozygosity is clearly elevated in inversion heterozygotes, underpinning substantial genetic differentiation between the inversions. (L) A schematic overview of the sequence of inversion events and the resulting haplotypes. The frequencies of the haplotypes differ markedly between the FM and NM types.
For a more detailed analysis of this inversion system, we subdivided it into three windows, based on the inversion coordinates suggested by long-range LD (Supplementary file 2; Supplementary file 3). Two windows correspond to those parts of the large inversion that do not overlap with the small inversion (roughly 2.7–4.4 Mbp and 11.3–18.7 Mbp; Supplementary file 2; Figure 2A–C). The central window corresponds to the overlap of both inversions (roughly 4.4–11.3 Mbp; Supplementary file 2; Figure 2A–C). In a principal component analysis (PCA) on the genetic variants in these windows (Supplementary file 3), the four FM individuals that were already identified as genetically distinct (Figure 1) always cluster separately and cannot be assessed (Figure 2D-G, red question mark). The remaining 44 individuals are split into three genotype classes when PCA is performed on the non-overlapping regions of the large inversion (Figure 2D and F). These classes correspond to individuals homozygous for the standard haplotype (SS), individuals homozygous for an inverted haplotype (I1I1), and heterozygotes for the inversion (SI1). Genotype assignments are confirmed by local admixture analysis (Figure 2—figure supplement 2), as well as heterozygosity (Figure 2H–K). Local observed heterozygosity is clearly elevated in individuals heterozygous for the inversion (Figure 2H and J), suggesting there are polymorphisms specific to the S or I1 haplotypes. This is corroborated by the patterns of genetic differentiation between SS and I1I1 homozygotes, which show FST values of up to 1 (Figure 2—figure supplement 3). For the homozygous genotypes, the haplotype which has the higher observed heterozygosity in homozygotes is considered the ancestral standard haplotype S (see Figure 2H–J).
In the region where both inversions overlap (Figure 2E), the 44 individuals separate into six genotype clusters, corresponding to three clusters of inversion haplotype homozygotes (SS, I1I1, I2I2) and three clusters of heterozygotes (SI1, SI2, I1I2). Individuals that are I1I1 genotypes in the 2.7–4.4 Mbp and 11.3–18.7 Mbp windows (Figure 2D and F) now separate into the I1I1, I1I2 and I2I2 clusters (Figure 2E). This indicates that the small inversion leading to haplotype I2 happened in the already inverted haplotype I1 of the large inversion (compare Figure 2L). This is backed up by genetic linkage data (Figure 5A and B) and consistent with I2I2 homozygotes showing even lower heterozygosity than I1I1 homozygotes (Figure 2I). Again, heterozygosity is clearly elevated in the inversion heterozygotes (Figure 2I). A sliding-window PCA analysis along the chromosomes (500 kb windows, 100 kb steps; Figure 2—figure supplements 4–21) confirms these patterns of segregation and the approximate genomic positions of inversion breakpoints.
Sliding-window PCA also indicated that there might be a third inversion, which ends at 24.8 Mbp (Figure 2A and G; Figure 2—figure supplements 4–21). Its start overlaps with the differentiated inversion system and is therefore hard to identify, but according to observed heterozygosity along the chromosome, it might lie at around 9 Mbp (Figure 2—figure supplement 22A). This inversion is only clearly detected in three SI3 heterozygotes (Figure 2G and K). These three individuals are found in the SS cluster in the other genomic windows, implying that the inversion leading to haplotype I3 happened in the standard haplotype S (Figure 2L).
One peculiarity deserves further investigation: In the window from 2.7 to 4.4 Mbp some individuals of the I1I1 genotype are found close to the SI1 heterozygotes. However, the pattern in which these individuals segregate in the 4.4–11.3 Mbp window unequivocally indicates that these must be I1I1 homozygotes. They also show much lower heterozygosity than the SI1 heterozygotes. Their peculiar clustering may indicate a certain degree of gene conversion between inversion haplotypes or a complex demographic history.
Taken together, our data support three inversion events, which we name In(1a), In(1b) and In(1c), leading to four distinct haplotypes of chromosome 1 (S, I1, I2 and I3; Figure 2L). From the PCA-derived genotypes (Figure 2D–G), we can infer the frequency of each haplotype (Figure 2L). There is marked genetic differentiation between the sympatric timing types. The FM type carries all haplotypes but is dominated by the S haplotype, while the NM type almost exclusively carries the inverted I1 and I2 haplotypes. The distribution of haplotypes also explains the observed patterns of LD. As the NM type is segregating almost exclusively for I1 and I2, it shows recombination suppression only over the length of I2. The FM type is segregating for the S haplotype vs both I1 and I2, and thus shows recombination suppression along I1.
Non-differentiated inversions on chromosomes 2 and 3
The analysis of long-range LD also identifies one putative chromosomal inversion on each arm of the metacentric chromosomes 2 and 3 (Figure 3B and C; Figure 3—figure supplement 1). We call them In(2L), In(2R), In(3L) and In(3R). In(2L) is confirmed by inverted marker order in linkage mapping (Figure 5A and B) and In(2R) is confirmed by SV calling (Supplementary file 1). As above, windowed PCA along the chromosomes confirms the approximate inversion breakpoints and allows to genotype the individuals for the inversion haplotypes (Figure 3D–G). Heterozygosity is clearly elevated in inversion heterozygotes (SI; Figure 3H–K) and lowered in the inversion homozygotes (II; Figure 3H–K). Notably, inversion homozygotes are absent for two of these inversions (Figure 3D and F) and very rare for the other two (Figure 3E and G), suggesting these inversions might impose some fitness constraints. These four inversions are only weakly differentiated between populations (Figure 3L–O). Only In(2L) shows up as a block of mildly elevated genetic differentiation between the strains (Figure 3A).

Inversions in chromosomes 2 and 3.
(A) Chromosome arm 2L harbors a block of mildly elevated genetic differentiation. (B) Several blocks of long-range linkage disequilibrium (LD) point to additional inversions in the FM and NM strains, one on each chromosome arm. (C) Structural variant (SV) calling from long read sequencing data supports the inversion on chromosome arm 2R. (D–G) Principal component analysis (PCA) for chromosomal sub-windows corresponding to suggested inversions separates the individuals into clusters corresponding to standard haplotype homozygotes (SS), inversion homozygotes (II) and inversion heterozygotes (SI). The four individuals which were already found distinct in whole genome analysis (red question mark) cannot be assessed. (H–K) Observed heterozygosity is clearly elevated in inversion heterozygotes, underpinning substantial genetic differentiation between the inversions. (L–O) The frequencies of the haplotypes do not differ much between the FM and NM types.
Crosses between the FM and NM types indicate that lunar reproductive timing is heritable and controlled by at least four QTL
Next, we tested for a genetic basis of lunar reproductive timing by performing crossing experiments between the FM and NM types (Figure 4). F1 hybrids emerge at an intermediate time point between the FM and NM type emergence times, with a slight shift toward FM emergence (Figure 4C). In the F2 generation, the phenotype distribution is spread out, but does not fully segregate into parental and F1 phenotype classes (Figure 4D). This indicates that the difference in lunar reproductive timing is controlled by more than one genetic locus. From the crossing experiment, we picked a set of several F2 families that together comprised 158 individuals, which all go back to a single parental pair – a FM type mother and a NM type father. Because of the limited genetic differentiation between the timing types (see Figure 1C and Kaiser et al., 2021), we sequenced the full genomes of the two parents in order to identify informative genetic markers for linkage mapping and QTL mapping. We picked a set of 32 microsatellite markers, 23 of which turned out to be reliably amplified and informative in the F2 families, as well as four insertion-deletion polymorphisms (Supplementary file 4).

Lunar emergence time is heritable.
In a cross between FM type (A) and NM type (B), the F1 hybrids emerges intermediate between the parents (C). In the F2 (D), the phenotypes spread out again, but do not segregate completely, suggesting that more than one locus controls lunar emergence time. The peaks around day 20 (gray shading) are direct responses to the artificial moonlight treatment (see Kaiser et al., 2021). They do not occur in the field and were not considered in our analyses. Yellow shading indicates the days with artificial moonlight treatment in the laboratory cultures.
Genetic linkage mapping confirmed the existence of the inversions In(1a) and In(1b), both of which are supported by inverted marker order on the genetic linkage map (Figure 5A). Marker order is also inverted for In(2L), but not for the other inversions (Figure 5A). We then performed stepwise QTL identification and Multiple QTL Mapping (MQM) as implemented in R/qtl and found four largely additive QTL for the difference in lunar reproductive timing (Figure 5A and Supplementary file 5). A specific scan for epistatic effects only shows a weak interaction of chromosome 1 with the QTL on chromosome 2 (Figure 5—figure supplement 1). The four QTL model explains 54% of the total variance in lunar reproductive timing and the estimated additive effects of the QTL account for 6 days of the timing difference between the FM and NM type (Figure 5A, Supplementary file 5). The individual loci explain 6.4%, 10.4%, 20.5%, and 10.8% of the variation in the phenotype (Supplementary file 5), but given the size of our mapping family these values could be overestimated by approximately threefold King and Long, 2017 due to the Beavis effect Slate, 2013; Xu, 2003. This suggests that the actual variance explained by these loci might be rather in the range of 2–7% and that additional loci of smaller effects are likely to contribute to the phenotype.

Quantitative trait loci (QTL) and genomic regions underlying FM vs NM emergence.
(A) Multiple QTL Mapping (MQM) identified four significant QTL controlling lunar emergence time. Confidence intervals were determined in MQM and Composite Interval Mapping (CIM). The marker order supports In(1a) and the nested In(1b) on chromosome 1, as well as In(2L) on chromosome 2. (B) QTL intervals were colour-coded, transferred to the reference sequence and assessed for loci which are strongly differentiated (FST) between the FM and NM types. Inversions are represented by blue bars below the plot. The QTL in In(1a) overlaps with the most differentiated locus in the genome (red asterisk) – the period locus. Other markedly differentiated loci within the QTL intervals are the plum/cask locus (red circle) and the stat1 locus (red cross). (C) Genetic differentiation is strongest in the first intron and the intergenic region just upstream of the period gene. (D) The period locus was re-genotyped based on an insertion-deletion mutation in the first intron (green arrows in (C)). The long (L) and short (S) period alleles are only loosely associated with inversion haplotypes.
In both MQM and additional Composite Interval Mapping (CIM; Figure 5—figure supplement 2), we estimated confidence intervals for the detected QTL (Figure 5A). There are two QTL on chromosome 1, one of which overlaps with the inversion system. The inversion genotypes of the parents were SI2 and I1I2 (Figure 5—figure supplement 3A), so that in the F1 we can expect partial recombination suppression, depending on the F1 genotypes. This is reflected in reduced map length for the respective genomic region (Figure 5A), but there was sufficient recombination to assign the QTL interval to only one end of In(1a). Based on the first genetic marker outside the QTL’s confidence intervals, we matched the QTL region with the corresponding genomic reference sequence (Figure 5B). The second QTL on chromosome 1 maps outside the inversion system and is well separated from the other QTL (Figure 5A and B). The other two QTL are found on chromosomes 2 and 3 and overlap with In(2R) and In(3L), respectively. The parental genotypes for In(2R) are SI and putatively II and for In(3L) are both SI (Figure 5—figure supplement 3B and C), so that we can again expect partial recombination suppression in the F1 and reduced map length. The genomic regions underlying these QTL are large and show several peaks in genetic differentiation (Figure 5B). Possibly, the QTL on chromosome 2 and 3 harbor a set of linked loci influencing the lunar emergence phenotype. Taken together, there are at least four unlinked loci which influence this magic trait. Given that the QTL mapping was based on a single pair of parents, the overall genetic architecture in the populations may be even more complex than what we described here.
Divergent alleles are not associated with specific inversion haplotypes
While some chromosomal inversions show up as genetically differentiated blocks, in all chromosomal inversions there are loci that are far more differentiated than the inversion itself (compare Figure 2L and Figure 3L to O with Figure 5B). This implies that the inversions are not required for protecting genetic differentiation. One explanation for such divergence peaks could be that individual SNPs in the ancestral arrangement are strongly differentiated between the timing types (while it is generally assumed that the inversion is fixed for one allele). Alternatively, there could be recombination and gene conversion placing the divergent alleles also in the inverted haplotypes. In both scenarios individual loci can diverge, while the surrounding inversions remain less differentiated. We tested these scenarios for the most differentiated locus in the genome (Figure 5B, red asterisk), which is the period locus (Figure 5C). We identified an insertion-deletion (indel) mutation in the highly differentiated first intron of the period gene (Figure 5C, green arrows) and re-genotyped the 48 individuals for this indel (Figure 5D; Figure 5—figure supplement 4). This confirmed that period alleles are indeed strongly differentiated between the FM and NM types. However, we could not find a close association between the period alleles and the inversion haplotypes of chromosome 1 (Figure 5D). Both alleles occur is several inversion haplotypes, confirming that there is gene flux between standard and inversion haplotypes. This implies that differentiation of the period locus is not driven by differentiation of the inversion system, that is it is not enhanced by LD with other locally adaptive variants.
Candidate loci
The overlap of genetic divergence peaks and the QTL’s 95% Bayesian intervals, allows us to identify potential candidate genes for controlling the magic trait, i.e. phenotypic divergence in lunar reproductive timing (see Supplementary file 6 for an overview). Some loci are particularly conspicuous in that they are by far the most divergent within a specific QTL. The QTL in the inversion system on chromosome 1 is narrow and harbors a single strongly divergent locus, which is the period locus (Figure 5B, red asterisk). The period locus is the most differentiated locus in the entire genome (maximum FST = 0.86) and period is a core circadian clock gene. The QTL at the end of chromosome 1, outside the inversion system, also shows a single strong divergence peak (Figure 5B, red circle). This is the first intron of the plum gene (Figure 5—figure supplement 5). The large intron likely contains regulatory regions for both plum and the directly adjacent cask gene. On chromosome arm 2R the QTL has several divergence peaks, but the strongest peak hits the stat1 gene (Figure 5B, red plus sign; Figure 5—figure supplement 6). Plum, cask, and stat1 are all involved in nervous system development in Drosophila melanogaster Yu et al., 2013; Gillespie and Hodge, 2013; Ngo et al., 2010, suggesting that divergence in lunar reproductive timing may involve nervous system remodeling as well as circadian timekeeping. These gene functions are plausible to be involved in lunar time-keeping and congruent with the results from a companion study Fuhrmann et al., 2023, which assessed the loss of lunar timing in a different set of Clunio populations. Thus, evidence from gene function, genetic divergence and QTL mapping align to support these candidate genes.
Discussion
We found multiple large and polymorphic chromosomal inversions in the sympatric FM and NM strains, covering all chromosome arms of the C. marinus genome. This is consistent with large non-pairing and inverted regions observed in C. marinus polytene chromsomes Michailova, 1980. These inversions reduce recombination, as is observed in large LD blocks (Figure 2—figure supplement 1 and Figure 3—figure supplement 1), and possibly lock several loci involved in divergent reproductive timing into supergenes (Figures 2A and 3A). However, most of these inversions are not differentiated between timing types and in all inversions, there are loci more divergent than the inversion itself. We show that divergent alleles at the period locus are not associated with specific inversion haplotypes or the standard haplotype (Figure 5D). Thus, there must be considerable recombination and/or gene conversion inside the inversion, placing the period alleles in different inversion haplotypes. This is likely also true for other divergent loci and forms the basis for genetic differentiation at individual loci to exceed the inversion background. We therefore propose that in this study system genetic linkage is not the major mechanism for coupling ecological divergence and assortative mating. Instead, we favor the magic trait scenario, in which recombination does not matter, because the relevant components of reproductive isolation are influenced by a single trait. Given a magic trait, all loci influencing the trait will be under divergent natural selection. We thus assume that peaks of genetic differentiation include the loci responsible for ecological adaptation, and that their differentiation is driven by permanent divergent natural selection. As a consequence, physical linkage between adaptive alleles and inversion haplotypes, though incomplete, may drive the spread of specific inversion haplotypes and thus differentiation of the inversion system on chromosome 1. Differentiation of the inversion system then creates a genetic substrate for further components of reproductive isolation to evolve and be genetically linked to the existing loci responsible for reproductive isolation. Eventually, this may lead to the completion of speciation.
At the same time, QTL mapping revealed at least four independent loci underlying lunar reproductive timing. Therefore, we are neither dealing with a single pleiotropic locus that affects several components of reproductive isolation, nor with a single supergene that would combine genetic variants underlying different components of reproductive isolation, such as local adaptation and assortative mating. Our data show that an oligogenic architecture can underlie ecological and reproductive divergence in sympatric populations. In addition, the four QTL we identified here for the difference in lunar reproductive timing between the FM and NM types in Roscoff do not overlap with the two QTL we identified previously for difference in lunar reproductive timing between Normandy (Por-1SL) and Basque Coast (Jean-2NM) strains Kaiser et al., 2016; Kaiser and Heckel, 2012. This indicates that within the species C. marinus, different populations have found different genetic solutions for adapting lunar reproductive timing to the local tidal regime. Given that both studies relied on a single crossing family, the overall genetic basis of lunar reproductive timing may even be more complex than what we have shown here.
During the last ice age, the English Channel did not exist Patton et al., 2017, setting the time frame of establishment of the sympatric Roscoff populations roughly to the last 10,000 years. Genetic differentiation between the inversion haplotypes (Figure 2—figure supplement 3) is higher than what would be expected to evolve in this time-frame. We thus assume that the inversions were already segregating in the ancestral populations or may have recently introgressed from a different geographic site. The latter might also be true for the genetic variants underlying sympatric ecological divergence.
The detection of period as the most differentiated locus between NM and FM types, overlapping with a QTL for lunar reproductive timing, suggests that this circadian clock gene might also be involved in lunar time-keeping. The other prominent candidates, plum, cask, and stat1 are involved in neuronal or synaptic development and plasticity. Notably, cask is a direct interaction partner of CaMKII Gillespie and Hodge, 2013, which we previously found to be involved in circadian timing differences in Clunio Kaiser et al., 2016. Hence, both QTL on chromosome 1 suggest a close connection of circadian and circalunar time-keeping. This is backed by recent findings in Baltic and Arctic Clunio strains, for which genomic analysis of the loss of lunar rhythms also implied circadian GO terms as the top candidates for affecting circalunar time-keeping, followed by GO terms involved in nervous system development Fuhrmann et al., 2023. Thus, our data not only elucidate the genetic architecture underlying a magic trait, but also give hints to the genetic basis of the yet enigmatic circalunar clock.
Methods
Sampling and laboratory strains
Laboratory strains for crossing experiments and field samples for whole genome sequencing of 48 individuals were available from a previous study Kaiser et al., 2021 (FM type = Ros-2FM; NM type = Ros-2NM). Laboratory strains were kept in standard culture conditions Neumann, 1966 at 20 °C and under a light-dark cycle of 16:8. An artificial moonlight cycle with four nights of dim light every 30 days served to synchronize reproduction in the laboratory strains.
Sequencing, read mapping, and genotype calling
DNA from the 48 field caught individuals were subject to whole genome sequencing in the Max Planck Sequencing Centre (Cologne) on an Illumina HiSeq3000 according to standard protocols. Independent sequencing runs were merged with the cat function. Adapters were trimmed with Trimmomatic Bolger et al., 2014 using the following parameters: ILLUMINACLIP <Adapter file>:2:30:10:8:true, LEADING:20, TRAILING:20, MINLEN:75. Overlapping read pairs were merged with PEAR Zhang et al., 2014 using -n 75 c 20 k and mapped to the Cluma_2.0 reference genome (available in the Open Research Data Repository of the Max Planck Society under DOI 10.17617/3.42NMN2; manuscript in preparation) with bwa mem Li and Durbin, 2009 version 0.7.15-r1140. Mapped reads were merged into a single file, filtered for -q 20 and sorted with samtools v1.9 Li et al., 2009. SNPs and small indels were called using GATK v3.7–0-gcfedb67 McKenna et al., 2010. All reads in the q20 sorted file were assigned to a single new read-group with ‘AddOrReplaceReadGroups’ script with LB = whatever PL = illumina PU = whatever parameters. Genotype calling was then performed with HaplotypeCaller and parameters --emitRefConfidence GVCF -stand_call_conf 30, recalibration of base qualities using GATK BaseRecalibrator with ‘-knownSites’. Preparing recalibrated BAM files with GATK PrintReads using -BQSR. Recalling of genotypes using GATK HaplotypeCaller with previously mentioned parameters. Individual VCF files were combined into a single file using GATK GenotypeGVCFs.
Genetic differentiation (FST)
The vcf file containing GATK-called SNPs and small indels from 48 Ros-2FM and Ros-2NM males was filtered with vcftools version 0.1.14 Danecek et al., 2011 for minor allele frequency of 0.05, maximum of two alleles, minimum quality (minQ) of 20, maximum missing genotypes of 20%, which finally left 721,000 variants. Genetic differentiation between the two populations (FST) was estimated using vcftools parameters --weir-fst-pop --fst-window-size 1 --fst-window-step 1.
Long-range linkage disequilibrium (LD)
Linkage disequilibrium was calculated between all variants along the three chromosomes in each of the populations in search of the signatures of large genomic inversions. Vcf file containing GATK-called SNPs and small indels from 24 Ros-2FM males was filtered for minor allele frequency of 0.20, maximum of two alleles, minimum quality (minQ) of 20, maximum missing genotypes per site of 20% leaving 344,331 variants. The same was done for the 24 Ros-2NM individuals resulting in 352,915 variants. The filtered vcf files were converted to plink input files and plink version 1.90 beta Chang et al., 2015 was used to calculate linkage disequilibrium between all variants with parameters --r2 --inter-chr. The large LD file was then sorted into 3 files, one for each chromosome. Breakpoints of large chromosomal inversions were approximated by plotting r2 and manually looking for obvious breaks in the r2 scores (Figure 2—figure supplement 1 and Figure 3—figure supplement 1; Supplementary file 2).
Principal component analysis (PCA)
PCA was run for the entire genome, for sliding windows along the chromosome (‘windowed PCA’), and for windows corresponding to the inversions (‘local PCA’). The vcf file containing GATK-called SNPs and small indels from 48 Ros-2FM and Ros-2NM males was filtered with vcftools version 0.1.14 Danecek et al., 2011 for minor allele frequency of 0.05, minimum quality (minQ) of 20, maximum missing genotypes per site of 20%, leaving 703.579 variants. Principal component analysis was calculated with plink version 1.90 beta Chang et al., 2015 using flags --nonfounders --pca var-wts --chr-set 3 no-xy no-mt. Windowed PCA: In order to investigate regions of the genome with unusual degree of variance, PCA was run in windows along the three chromosomes. Vcf files were subdivided into vcf files containing variants that belong to 500 kb sliding windows with 100 kb steps. They were further used to calculate PCA values for each window as noted above. Local PCA: To get the genotype-estimate of the identified inversions (see long-range disequilibrium section), we subdivided vcf files according to the estimated inversion breakpoints (Supplementary file 2), and calculated principal components 1 and 2 as described above. Number of variants that belong to each window are listed in Supplementary file 3.
Admixture
Ancestry and relatedness of the 48 individuals of Ros-2FM and Ros-2NM population were investigated with admixture version 1.3.0 Alexander et al., 2009. Admixture was run for the entire genome, for sliding windows along the chromosome (‘windowed admixture’), and for windows corresponding to the inversions (‘local admixture’). We used vcf files previously described in the PCA section as input. Admixture was calculated using k of 2–4. Windowed admixture: In order to identify regions of the genome with ancestry different from the general population, we ran admixture on vcf files containing variants that belong to 500 kb sliding windows with 100 kb steps. Local admixture: In order to genotype the inversions (see long-range disequilibrium section) we calculated admixture using vcf files containing only variants from those regions of the genome (Supplementary file 2; Supplementary file 3).
Observed heterozygosity
In order to complement the windowed and local PCA and admixture analyses and further corroborate the indirect genotyping of the chromosomal inversions, we calculated observed heterozygosity using vcftools version 0.1.14 Danecek et al., 2011 with --het flag. Proportion of observed heterozygosity was calculated for each file by dividing the number of observed heterozygotes with the number of variants. Patterns of heterozygosity along the chromosomes were assessed based on the vcf files containing variants that belong to 500 kb sliding windows with 100 kb steps (see PCA and admixture section). Finally, in order to calculate observed heterozygosity of the chromosomal inversions, observed heterozygosity was calculated per inversion window for each individual (Supplementary file 2; Supplementary file 3). Then observed heterozygosity was calculated and plotted for each inversion genotype according to local PCA and admixture.
Crosses, phenotyping, and genotyping
After synchronizing the NM and FM strains by applying different moonlight regimes, single pair crosses were performed. F1 egg clutches were reared individually and the emerging adults were allowed to mate freely within each F1 family, leading to several sets of F2 families that go back to a single pair of parents. One of these sets was picked for QTL mapping. As the second peak is not under clock control, but a direct response to moonlight Kaiser et al., 2021, analysis was restricted to individuals emerging in the first peak (n=158; compare non-shaded area in Figure 4D.). In the F2, the peaks are clearly separated from each other by days without any emergence. DNA was extracted with a salting out method Reineke et al., 1998 and amplified with the REPLI-g Mini Kit (Qiagen) according to the manufacturer’s instructions. The two parents were subject to whole genome sequencing with 2x150 bp reads on an Illumina HiSeq2000 according to standard protocols. Read mapping and genotype calling were performed as described above. Microsatellite and indel genotypes were obtained by custom parsing of the vcf files. PCR primers (Supplementary file 4) for amplifying the microsatellite and indel regions were designed with Primer3. Indels were PCR amplified and then run and scored on 1.5% agarose gels. Microsatellites were amplified with HEX- and FAM-labeled primers and run on an ABI PRISM 3100 Genetic Analyzer. Chromatograms were analyzed and scored with in R with the Fragman package Covarrubias-Pazaran et al., 2016. The resulting genotype matrix can be found in the R/qtl input file (Supplementary file 7).
Linkage and QTL mapping
Linkage analysis and QTL mapping were based on a set of F2 families derived from a single pair of parents. Analyses were performed in R/qtl according to the script in Source code 1. Briefly, the recombination fraction (est.rf) and distribution of alleles (checkAlleles) were assessed to confirm the quality of the input data. The correct marker order was inferred by rippling across all markers of each chromosome (ripple). Missing data (countX0) and error load (top.errorload) were assessed. Then the best model was obtained in a stepwise selection procedure (stepwiseqtl). Additional interactions were checked for in a two QTL scan (scantwo), but were negligible and not considered in the final model. Finally, a model with 4 non-epistatic QTL was subject to model fitting for Multiple QTL Mapping (fitqtl) and the QTL’s 95% Bayesian confidence intervals were estimated (bayesint). Composite Interval Mapping was performed in QTLcartographer Bioinformatics Research Center, 2006 with various selection procedures (forward, backward), exclusion windows (10 cM and 20 cM) and covariates (3, 5, 10; Figure 5—figure supplement 2). The LOD significance threshold was estimated by running 1000 permutations and a p value of 0.05. For the QTL locations in CIM the 1 LOD intervals were plotted.
PacBio data and structural variant calling
PacBio long read data was obtained for pools of 300–500 individuals from laboratory strains of Ros-2NM and Ros-2FM (Roscoff, France). DNA was extracted as above and sequenced with standard protocols on a PacBio Sequel II at the Max Planck Sequencing Facility in Cologne, Germany. Raw long-reads of Ros-2FM and Ros-2NM were mapped against the reference CLUMA 2.0 (publication in preparation) using NGM-LR v.0.2.7 Sedlazeck et al., 2018 with default settings. Alignments were sorted, filtered (q20) and indexed with samtools v.1.9 Li et al., 2009. Three different SV-calling tools were used per population to discover SVs. Sniffles v.1.0.11 Sedlazeck et al., 2018 was run with parameters -- min_het_af 0.1 and –genotype. SVIM v1.2.0 Heller and Vingron, 2019 was run with default parameters (svim alignment). Finally, Delly v0.8.6 Rausch et al., 2012 was run with parameters lr -y pb -q 20 --svtype (INS, insertions; INV, inversions; DUP, duplications; DEL, deletions) and the output VCF files per SV type were merged with a custom script. The resulting VCF files were then sorted using vcf-sort and filtered for quality “PASS” using a custom bash script. SURVIVOR v1.0.6 Jeffares et al., 2017 was used to filter variants for a minimum size of 300 bp and at least 5 reads supporting each variant (SURVIVOR filter NA 300–1 0 5). BND variants detected with Sniffles and SVIM calls were excluded with a custom bash script before using SURVIVOR merge (options set to 50 1 1 0 0 300) on all VCFs produced per population. The merged VCFs were then used as an input to reiterate SV calling with Sniffles using the same parameters as above plus --Ivcf option and --min_support 5 (minimum number of reads supporting a SV). Finally, SURVIVOR merge (same parameters as above) was used to merge the re-genotyped SVs detected in Ros-2NM and Ros-2FM. SV support was reported in Figures 1 and 2 if there was a breakpoint detected within 3 kb of the boundaries of the LD blocks. The full set of SV calls is given in Supplementary file 1.
Period genotyping
An insertion-deletion (indel) mutation in the period locus was genotyped. The fragment was PCR amplified (primers: 5’-GAATACTGAGTGTAAGACTTGGC and 5’-ACAACGTGACCTGTGACAAT) and run and scored on 1.5% agarose gels.
Materials and correspondence
Requests should be addressed to Tobias S. Kaiser (kaiser@evolbio.mpg.de).
Data availability
Sequencing data was submitted to ENA under project number PRJEB54033. The CLUMA2.0 reference genome is available on the Open Research Data Repository of the Max Planck Society (EDMOND) under https://doi.org/10.17617/3.42NMN2.
-
EBI European Nucleotide ArchiveID PRJEB54033. Roscoff FM and NM individuals.
References
-
Fast model-based estimation of ancestry in unrelated individualsGenome Research 19:1655–1664.https://doi.org/10.1101/gr.094052.109
-
Trimmomatic: a flexible trimmer for illumina sequence dataBioinformatics 30:2114–2120.https://doi.org/10.1093/bioinformatics/btu170
-
Recombination and speciationMolecular Ecology 14:2621–2635.https://doi.org/10.1111/j.1365-294X.2005.02617.x
-
The variant call format and vcftoolsBioinformatics 27:2156–2158.https://doi.org/10.1093/bioinformatics/btr330
-
Skepticism towards SANTA rosalia, or why are there so few kinds of animals?Evolution; International Journal of Organic Evolution 35:124–138.https://doi.org/10.1111/j.1558-5646.1981.tb04864.x
-
Sympatric speciation in the genomic eraTrends in Ecology & Evolution 33:85–95.https://doi.org/10.1016/j.tree.2017.11.003
-
BookFitness Landscapes and the Origin of SpeciesPrinceton University Press.https://doi.org/10.1515/9780691187051
-
Cask regulates CaMKII autophosphorylation in neuronal growth, calcium signaling, and learningFrontiers in Molecular Neuroscience 6:27.https://doi.org/10.3389/fnmol.2013.00027
-
SVIM: structural variant identification using mapped long readsBioinformatics 35:2907–2915.https://doi.org/10.1093/bioinformatics/btz041
-
Annual, Lunar, and Tidal Clocks: Patterns and Mechanisms of Nature’s Enigmatic RhythmsLocal adaptations of circalunar and circadian clocks: the case of Clunio marinus, Annual, Lunar, and Tidal Clocks: Patterns and Mechanisms of Nature’s Enigmatic Rhythms, Springer, 10.1007/978-4-431-55261-1.
-
Timing strains of the marine insect Clunio marinus diverged and persist with gene flowMolecular Ecology 30:1264–1280.https://doi.org/10.1111/mec.15791
-
The beavis effect in next-generation mapping panels in Drosophila melanogasterG3: Genes, Genomes, Genetics 7:1643–1652.https://doi.org/10.1534/g3.117.041426
-
The sequence alignment/map format and samtoolsBioinformatics 25:2078–2079.https://doi.org/10.1093/bioinformatics/btp352
-
ConferenceComparative External Morphological and Karyological Characteristics of European Species of Genus Clunio HALIDAY 1855 VIIInternational Symposium of Chironomidae. pp. 9–15.
-
Die lunare und tägliche Schlüpfperiodik der Mücke Clunio - Steuerung und Abstimmung auf die GezeitenperiodikZeitschrift Für Vergleichende Physiologie 53:1–61.https://doi.org/10.1007/BF00343045
-
Genetic adaptation in emergence time of Clunio populations to different tidal conditionsHelgoländer Wissenschaftliche Meeresuntersuchungen 15:163–171.https://doi.org/10.1007/BF01618620
-
Deglaciation of the Eurasian ice sheet complexQuaternary Science Reviews 169:148–172.https://doi.org/10.1016/j.quascirev.2017.05.019
-
Preparation and purification of DNA from insects for AFLP analysisInsect Molecular Biology 7:95–99.https://doi.org/10.1046/j.1365-2583.1998.71048.x
-
Genomics and the origin of speciesNature Reviews. Genetics 15:176–192.https://doi.org/10.1038/nrg3644
-
Magic traits in speciation: “ magic ” but not rare?Trends in Ecology & Evolution 26:389–397.https://doi.org/10.1016/j.tree.2011.04.005
-
From Beavis to beak color: a simulation study to examine how much QTL mapping can reveal about the genetic architecture of quantitative traitsEvolution; International Journal OF Organic Evolution 67:1251–1262.https://doi.org/10.1111/evo.12060
-
The role of allochrony in speciationMolecular Ecology 26:3330–3342.https://doi.org/10.1111/mec.14126
-
Sympatric speciation in animals: the ugly duckling grows upTrends in Ecology & Evolution 16:381–390.https://doi.org/10.1016/S0169-5347(01)02188-7
-
Theoretical basis of the Beavis effectGenetics 165:2259–2268.https://doi.org/10.1093/genetics/165.4.2259
Decision letter
-
Detlef WeigelSenior and Reviewing Editor; Max Planck Institute for Biology Tübingen, Germany
Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.
Decision letter after peer review:
Thank you for submitting your article "An oligogenic architecture underlying ecological and reproductive divergence in sympatric populations" for consideration by eLife. Your article has been reviewed by 2 peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Detlef Weigel as the Senior Editor. The reviewers have opted to remain anonymous.
The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission. As you will see, both expert reviewers like the topic and the system you have established, but have also significant concerns, mostly about interpretation, but some also regarding technical issues.
Essential revisions:
Please be more stringent in your discussion and interpretation of 'magic traits' and their genetic basis, and especially how they can be differentiated from related genetic and evolutionary phenomena.
State clearly that the conclusions are drawn from a single cross rather than population-wide linkage mapping and that it cannot be excluded that the species-wide genetic architecture of the trait of interest is more complex than suggested here.
Temper your claims of genomic regions of high differentiation being the causal drivers of the ecological differentiation.
Add technical detail, as outlined in the individual reviews.
Reviewer #1 (Recommendations for the authors):
This manuscript describes a very interesting set of experiments on co-located populations of marine midge that emerge either at full or at new moon. There are several aspects of the study that will attract broad attention: the genetic architecture of a trait involved in reproductive isolation, the role of chromosomal inversions in population divergence and the link between the circadian and circalunar clocks. The authors choose to focus on 'magic traits'. This is not a straightforward idea. For a broad audience, it is important that the idea is explained carefully and I do not think that the current Introduction provides such a clearly-argued context. I have made specific comments by line numbers below. A related issue is whether the timing difference is actually under divergent selection. Probably it is but this is important for the argument in the manuscript and so reasons for expecting divergent selection and the likely nature of the selection should be described.
The experiments and analyses generally appear robust and appropriately interpreted. I have some reservations that are detailed below but I doubt any of them will influence the major conclusions. Data are not presented for long-read detection of SVs. There is generally some uncertainty about the interpretation of inversions, but this is handled well. The inversion genotypes of individuals used in the QTL mapping crosses were not determined (or at least they are not presented). This influences the direct interpretation of the crosses but it may also influence the larger picture because the mapping cross is just between two individuals. If these individuals were homozygous for the same arrangement of a given inversion, the cross might give a very different outcome from one between individuals homozygous for different arrangements. This is a particular case of the general problem of interpreting a cross between two individuals as if it represents the genetic basis of the difference between two populations.
More specific comments follow by line number:
30-32 – this is not a great summary because it starts by contrasting two extremes and then effectively says that there is evidence for some kind of intermediate. It is surprising that Coyne and Orr's book is not cited since it is probably the best summary of the debate up to 2004.
33 – 'mechanisms' is not a good choice of word here since it implies something that is fabricated by selection for the purpose of isolation
34-36 – also cite Felsenstein 1981. Note that 'ecological divergence' is a component of 'reproductive isolation' and so the wording here is problematic. It should be something like 'ecological divergence and other components of reproductive isolation'. This applies at multiple points in the manuscript.
37 – while ref. 8 is the origin of the term 'magic trait', ref. 13 provided an important discussion of the idea and ref. 6 provided an alternative terminology as well as clarifying the relationship with pleiotropy. If the magic trait/multiple-effect trait idea is central to this manuscript, some of this literature should be cited.
41 – see ref. 6 for a view on the role of pleiotropy. In my view, nothing in the magic trait idea implies a small number of genes: the point is that the trait contributes to multiple components of reproductive isolation and this can be true regardless of its genetic basis. This is part of the reason why confounding the idea with pleiotropy is unhelpful.
42-44 – the theoretical basis for this claim should be supported with citations, as well as giving examples in the next sentence.
65 – actually, this is not enough to meet the magic trait definition because there also need to be evidence for 'ecological divergence': the FM and NM types need to be locally adapted to their niches in some sense. Alternatively, hybrids might suffer a fitness cost – a form of incompatibility. This would fit the multiple-effect trait definition advanced by ref. 6 but not the definition of magic traits used here (although I think it might fit the definition originally used by Gavrilets).
72 – I think the authors probably mean 'linkage disequilibrium' here, rather than 'genetic linkage'.
91-2 – some caution is needed here since F1 and BX hybrids alone do not represent gene flow. A plot of heterozygosity against hybrid index, or an analysis using NewHybrids would help to show whether later generation hybrids occur. However, caution is also needed because of segregation of structural variants that might be shared between ecotypes and can distort Fst and admixture analyses.
98-100 – meaning 'polymorphic in both timing types', I assume. It seems that the data behind the SV calls in figure 1C are not shown anywhere.
Figure 1 and S4 – running the windowed PCA and the PCAs in Figure 1 without the 4 outlying FM individuals might well give clearer patterns. The rationale behind the grouping into genotypes in Figure 1 is not clear.
134-140 – support for I3 is relatively weak, but that is clear from the text.
141-6 – this partly answers the question about assignment to genotypes – it uses more than just the PCA shown in Figure 1. At present, this is all quite hard to access, even though I am used to plots of this type.
150-2 – it may be worth noting that this helps to explain the LD patterns in Figure S2: because NM segregates only for I1 and I2, there is LD only in the I2 region (the only region where recombination is suppressed). In FM, recombination is suppressed over the length of I1, but we might expect more gene flux in the area of I2 because it is collinear in SI2 individuals. However, this is all without thinking about gene flow between FM and NM. The strong differences in LD seem quite surprising to me, given the previous inference of extensive gene flow.
Table S1 gives the margins of the area of SD (with surprisingly high precision). Since LD can extend well beyond breakpoints it may be better not to think of these as estimated breakpoint positions.
158 – again, SV calling data are not shown.
Figure 3 – it is unfortunate that the 'peaks around day 20' overlap with the 'natural' distributions, and no single cut-off can separate them. Do points around day 25 belong to the 'spurious' distribution? Are conclusions about QTL sensitive to decisions about boundaries?
168 – maybe 'at least four QTL' since it is impossible to exclude the presence of other loci or of more loci under the broad peaks observed (indeed, Figure S8 does suggest multiple loci under some peaks).
Figure 4A,B – it is not clear why two lines are plotted in 4A for Chr1. The inference that the gene order is 'reversed' in I1 and I2 is relative to the reference genome (and actually quite hard to see in the current plot). Importantly, there is no evidence of recombination suppression in the cross analysed, which implies that the FM and NM parents were homozygous for the same (inverted) arrangement. This is actually quite surprising, given the arrangement frequencies inferred from Figure 1. However, it is unclear whether the methods used would have picked up partial recombination suppression due to heterozygosity in the grandparents (leading to heterozygosity in some but not all of the F1 individuals).
185 – does 'largely additive' mean that dominance effects were small or not significant? In Table S4d, the dominance effects look non-significant but there is no comparison of models with or without dominance. Fig, S7B,C have not come through well in my PDF. Here the tests for epistasis seems to exclude dominance.
187-90 – this, of course, confirms that other loci are likely to contribute.
193-5 – see comments above on Figure 4A,B. It needs to be clear what the inferred inversion genotypes of the parents and F1 were in these crosses. This influence interpretation substantially.
199 – but these inversions were also apparently not segregating in the cross families.
207 – Figure 4B does not make this clear because it does not show Fst for each inversion, only for SNPs within inversion. Also note that a new inversion is typically fixed for one allele at any locus where the ancestral arrangement is polymorphic. There are then many combinations of inversion and SNP frequencies in two populations for which Fst is higher for the SNP than for the inversion, without any requirement for gene flux (as suggested on line 209). Gene flux is needed for cases where both arrangements carry both alleles (as seems to happen in some cases in Figure 4D, although the situation here is complicated by the putatively overlapping inversions which can enhance gene flux).
217-8 – there is no expectation that differentiation at a single locus will be protected by an inversion so it would help to reword this sentence a bit. I think the authors mean that differentiation at period is not enhanced by association with other locally-adaptive variants that is maintained by an inversion.
250-1 – this comes back to issues in the Introduction with the 'magic trait' idea. The basis of this idea is that recombination is less important because recombination cannot separate the two (or more) components of reproductive isolation influenced by the trait. If one component of isolation is local adaptation (as proposed here but not directly established) then each locus influencing the trait will be under divergent selection. An inversion might be favoured because it maintains advantageous combinations of these loci (the Kirkpatrick-Barton mechanism) but this is different from the case where assortative mating (for example) is not under direct selection and so only evolves in response to indirect selection (as in classical models of reinforcement).
251-3 – this is a slightly odd distinction. I think it needs to be explained a bit more fully and, ideally, related to relevant models for the spread of inversions.
265-7 – this comment about what is 'generally assumed' should be supported by citations. It reflects a view in the Introduction that 'magic traits' are expected to have a simple genetic basis, a view that I consider incorrect (see earlier comments).
272-3 – I am not sure that it is helpful to introduce the idea of 'evolutionary branching' here without further discussion. It would be more in line with the rest of the manuscript to say that the timing ecotypes might accumulate other components of reproductive isolation and eventually complete speciation.
287-8 – in line with previous comments, I think that describing the genetic basis of a putative ‘magic trait’ is a valuable contribution but does not require any re-evaluation.
Methods
328-30 – this identifies the margins of the region of LD, which do not necessarily correspond to breakpoints.
372-3 – this does not provide sufficient information – see comments on the Results.
384-5 – this analysis used all F2 families derived from one pair of grandparents, I believe. This should be made explicit. The inversion genotypes of the parents should be determined (and ideally of the F1 individuals, but this would have to be inferred from their offspring, I think), because the parents may have been heterozygous and so the F1 may have varied.
389-90 – so far as I can see this was without formal testing for dominance effects.
395-413 – it seems that the results of these analyses are not available anywhere.
Reviewer #2 (Recommendations for the authors):
This study system promises to hold exciting insights into the genetic basis of ecological traits but there are limitations in the analyses and interpretation that temper my overall enthusiasm for the paper. These are included in my public review but I list them with specific analysis suggestions below:
1) The QTL mapping relies on a modest number of individuals and there are important details missing. The authors should provide the overall heritability of the trait, which can be estimated from their parental and offspring data with some assumptions. For the estimates of QTL effect sizes from such small samples, there is a common issue known as the Beavis effect that can inflate the effect size of individual QTL. Related to this, the total effect size of the QTL reported is extremely high and the authors should consider (e.g. with simulations) that the effect size is substantially inflated. The authors should report individual effect sizes and explore their power to detect QTL with different effect sizes given their study design. I think that consideration of heritability and overall power will help with interpretation of this result.
2) My major issue with the paper is the interpretation of divergence within the inversion as linked causally to the genes underlying ecological divergence. As the authors observe, divergence will vary within an inverted region. This can be traced to myriad factors, including variation in mutation rate, variation in constraint, patterns of ancestral polymorphism within this region, and variation in gene conversion within the inversion. Given this, I do not think it is valid to interpret the regions of high differentiation as the causal drivers of the ecological differentiation. In my view, the results need to be reframed in the context of ecological adaptation dependent on large inversions and potentially interesting candidate genes, without a claim that the results can be narrowed to individual genes.
3) The authors imply in the discussion based on historical results that the ecotype evolved in situ in the last ~60 years. This speculation should be removed as it seems much more likely that the ecotype was either missed in sampling or a recent migrant introduced it to the area in the recent past.
https://doi.org/10.7554/eLife.82825.sa1Author response
Essential revisions:
Please be more stringent in your discussion and interpretation of 'magic traits' and their genetic basis, and especially how they can be differentiated from related genetic and evolutionary phenomena.
We have completely rewritten the introduction as well as the discussion, in order to now make a clear distinction between magic traits, pleiotropy and genetic linkage as mechanisms for coupling different components of reproductive isolation.
State clearly that the conclusions are drawn from a single cross rather than population-wide linkage mapping and that it cannot be excluded that the species-wide genetic architecture of the trait of interest is more complex than suggested here.
We have included respective statements in the results (lines 209-218; 259-261), discussion (lines 386-398) and method sections (lines 481-482).
Temper your claims of genomic regions of high differentiation being the causal drivers of the ecological differentiation.
We tempered our wording and now speak of “potential candidate genes” which “may” point to specific molecular mechanisms to be involved in ecological divergence.
Add technical detail, as outlined in the individual reviews.
We have added technical detail in many sections, as requested by the reviewers (for details see specific replies below).
Reviewer #1 (Recommendations for the authors):
This manuscript describes a very interesting set of experiments on co-located populations of marine midge that emerge either at full or at new moon. There are several aspects of the study that will attract broad attention: the genetic architecture of a trait involved in reproductive isolation, the role of chromosomal inversions in population divergence and the link between the circadian and circalunar clocks.
We thank the reviewer for this positive assessment of our work.
The authors choose to focus on 'magic traits'. This is not a straightforward idea. For a broad audience, it is important that the idea is explained carefully and I do not think that the current Introduction provides such a clearly-argued context. I have made specific comments by line numbers below.
We have entirely revised the introduction in order to provide a clearer context (lines 30-62), based on the very helpful detailed comments that the reviewer gave below.
A related issue is whether the timing difference is actually under divergent selection. Probably it is but this is important for the argument in the manuscript and so reasons for expecting divergent selection and the likely nature of the selection should be described.
We have inserted the reasoning for why divergent selection is expected (lines 75-88).
The experiments and analyses generally appear robust and appropriately interpreted. I have some reservations that are detailed below but I doubt any of them will influence the major conclusions. Data are not presented for long-read detection of SVs. There is generally some uncertainty about the interpretation of inversions, but this is handled well. The inversion genotypes of individuals used in the QTL mapping crosses were not determined (or at least they are not presented). This influences the direct interpretation of the crosses but it may also influence the larger picture because the mapping cross is just between two individuals. If these individuals were homozygous for the same arrangement of a given inversion, the cross might give a very different outcome from one between individuals homozygous for different arrangements. This is a particular case of the general problem of interpreting a cross between two individuals as if it represents the genetic basis of the difference between two populations.
We appreciate that the reviewer considers our analyses robust and appropriately interpreted. Nevertheless, the reviewer makes very detailed and helpful suggestions, which clearly improve our manuscript, even though the major conclusions are indeed not influenced. We thank the reviewer for these comments. As all of the points raised here are detailed below, we respond to them one by one below.
More specific comments follow by line number:
30-32 – this is not a great summary because it starts by contrasting two extremes and then effectively says that there is evidence for some kind of intermediate. It is surprising that Coyne and Orr's book is not cited since it is probably the best summary of the debate up to 2004.
We have revised this section and inserted the citation (lines 30-34).
33 – 'mechanisms' is not a good choice of word here since it implies something that is fabricated by selection for the purpose of isolation
We have replaced “mechanism” with “component” (line 36).
34-36 – also cite Felsenstein 1981. Note that 'ecological divergence' is a component of 'reproductive isolation' and so the wording here is problematic. It should be something like 'ecological divergence and other components of reproductive isolation'. This applies at multiple points in the manuscript.
We have inserted the citation and changed the wording as suggested (lines 37-39).
37 – while ref. 8 is the origin of the term 'magic trait', ref. 13 provided an important discussion of the idea and ref. 6 provided an alternative terminology as well as clarifying the relationship with pleiotropy. If the magic trait/multiple-effect trait idea is central to this manuscript, some of this literature should be cited.
We have changed the citations as requested and introduced a clear distinction between pleiotropy, magic traits and genetic linkage (lines 39-59).
41 – see ref. 6 for a view on the role of pleiotropy. In my view, nothing in the magic trait idea implies a small number of genes: the point is that the trait contributes to multiple components of reproductive isolation and this can be true regardless of its genetic basis. This is part of the reason why confounding the idea with pleiotropy is unhelpful.
We agree with this comment and have inserted a statement that the concept of a magic trait does not include an assumption on the genetic basis of the trait and should not be confused with pleiotropy (lines 44-46).
42-44 – the theoretical basis for this claim should be supported with citations, as well as giving examples in the next sentence.
We have inserted both a citation and examples (lines 51-59).
65 – actually, this is not enough to meet the magic trait definition because there also need to be evidence for 'ecological divergence': the FM and NM types need to be locally adapted to their niches in some sense. Alternatively, hybrids might suffer a fitness cost – a form of incompatibility. This would fit the multiple-effect trait definition advanced by ref. 6 but not the definition of magic traits used here (although I think it might fit the definition originally used by Gavrilets).
We have inserted an explanation of how and why the FM and NM types are expected to be under divergent ecological selection (lines 75-84). Interestingly, it is also true that hybrids are expected to suffer a fitness cost. We have inserted this as a second component of reproductive isolation (lines 84-88).
72 – I think the authors probably mean 'linkage disequilibrium' here, rather than 'genetic linkage'.
We replaced 'genetic linkage' with 'linkage disequilibrium' (line 95).
91-2 – some caution is needed here since F1 and BX hybrids alone do not represent gene flow. A plot of heterozygosity against hybrid index, or an analysis using NewHybrids would help to show whether later generation hybrids occur. However, caution is also needed because of segregation of structural variants that might be shared between ecotypes and can distort Fst and admixture analyses.
We have rephrased the section to be more cautious about the issues the reviewer is raising here (lines 115-117).
98-100 – meaning 'polymorphic in both timing types', I assume. It seems that the data behind the SV calls in figure 1C are not shown anywhere.
We have inserted that inversions are ´polymorphic´ (lines 123-126). The results of SV calling are now given in a new Supplementary File 1 and the sequence data behind it are included in our ENA submission PRJEB54033.
Figure 1 and S4 – running the windowed PCA and the PCAs in Figure 1 without the 4 outlying FM individuals might well give clearer patterns. The rationale behind the grouping into genotypes in Figure 1 is not clear.
We have performed PCA without the 4 outlying FM individuals, but the overall picture did not get clearer. Therefore, we decided to show the full dataset. The rationale behind the grouping into genotypes is explained in the text (lines 132-168).
134-140 – support for I3 is relatively weak, but that is clear from the text.
As the reviewer considers weak support for I3 to be clear from the text, we did not make any changes.
141-6 – this partly answers the question about assignment to genotypes – it uses more than just the PCA shown in Figure 1. At present, this is all quite hard to access, even though I am used to plots of this type.
Yes, we also relied on heterozygosity and admixture analyses. In order to make this clearer, we rewrote the corresponding section of the results (lines 145-149) and inserted a new “Figure 2—figure supplement 2” with the local admixture analysis.
150-2 – it may be worth noting that this helps to explain the LD patterns in Figure S2: because NM segregates only for I1 and I2, there is LD only in the I2 region (the only region where recombination is suppressed). In FM, recombination is suppressed over the length of I1, but we might expect more gene flux in the area of I2 because it is collinear in SI2 individuals. However, this is all without thinking about gene flow between FM and NM. The strong differences in LD seem quite surprising to me, given the previous inference of extensive gene flow.
Table S1 gives the margins of the area of SD (with surprisingly high precision). Since LD can extend well beyond breakpoints it may be better not to think of these as estimated breakpoint positions.
We inserted the explanation of the LD blocks based on the distribution of haplotypes (lines 188-192). We also think of the margins of LD as approximations of the inversion breakpoints and state this clearly in Supplementary File 2, as well as in the methods (line 420-423).
158 – again, SV calling data are not shown.
We have inserted the SV calling data and now refer to it in the text (lines 123-126). See comment above.
Figure 3 – it is unfortunate that the 'peaks around day 20' overlap with the 'natural' distributions, and no single cut-off can separate them. Do points around day 25 belong to the 'spurious' distribution? Are conclusions about QTL sensitive to decisions about boundaries?
It is indeed unfortunate that the spurious peaks do partially overlap with the natural distribution of the parental strains. However, QTL analysis solely relies on the phenotype distribution of the F2. Therefore, a single cut-off for all distributions is not required. In the F2 the spurious peak and the actual peak are separated by at least 1 day without any emergence. Hence, the decision about the boundaries is clear and we did not have to test different boundaries in QTL mapping.
Individuals emerging around day 25 indeed belong to the spurious peak. We have adjusted the figure accordingly (now Figure 4).
168 – maybe 'at least four QTL' since it is impossible to exclude the presence of other loci or of more loci under the broad peaks observed (indeed, Figure S8 does suggest multiple loci under some peaks).
We have changed the wording to at least 4 QTL (lines 209-210). We also noticed that CIM in Figure 5—figure supplement 2 (previously Figure S8) suggests multiple loci under some peaks, but these patterns change with size of the exclusion window in CIM, so we decided not to mention them in order not to over-interpret the CIM.
Figure 4A,B – it is not clear why two lines are plotted in 4A for Chr1. The inference that the gene order is 'reversed' in I1 and I2 is relative to the reference genome (and actually quite hard to see in the current plot). Importantly, there is no evidence of recombination suppression in the cross analysed, which implies that the FM and NM parents were homozygous for the same (inverted) arrangement. This is actually quite surprising, given the arrangement frequencies inferred from Figure 1. However, it is unclear whether the methods used would have picked up partial recombination suppression due to heterozygosity in the grandparents (leading to heterozygosity in some but not all of the F1 individuals).
There are two lines for Chr1 because two QTL were detected in Multiple QTL Mapping, and then the LOD profiles for both QTL are plotted along the chromosome. Indeed, the inference that the gene order is reversed is relative to the reference genome.
Regarding recombination suppression, there is partial recombination suppression in Chr1. Note that the genetic distance on the linkage map between markers C1M7 and C1M5 is much smaller on the genetic map compared to the corresponding sequence in the reference genome. We have now determined the inversion genotypes for the parents of the cross to be SI2 and I1I2 (new Figure 5—figure supplement 3) and these genotypes explain the partial recombination suppression in the F1 generation.
185 – does 'largely additive' mean that dominance effects were small or not significant? In Table S4d, the dominance effects look non-significant but there is no comparison of models with or without dominance. Fig, S7B,C have not come through well in my PDF. Here the tests for epistasis seems to exclude dominance.
Indeed, the dominance effects given by the fitqtl function (see Supplementary File 5d) are small. The tests for epistasis (scantwo) in “Figure 5—figure supplement 1” (previously Supplementary Figur 7) did not produce significant dominance interactions either. In scantwo there was a significant interaction for two loci (c1:c2), but in the model selection procedure implemented in stepwiseqtl this interaction did not significantly improve the full model and was therefore dropped again from the analysis.
187-90 – this, of course, confirms that other loci are likely to contribute.
Yes, indeed. We included this statement (lines 239-240).
193-5 – see comments above on Figure 4A,B. It needs to be clear what the inferred inversion genotypes of the parents and F1 were in these crosses. This influence interpretation substantially.
The inversion genotypes of the parents were determined to be SI2 and I1I2 (new “Figure 5—figure supplement 3”), so that we can expect partial recombination suppression in the F1. This information was inserted in the Results section (lines 244-246; 253-255).
199 – but these inversions were also apparently not segregating in the cross families.
Here the parental inversion genotypes were SI and II or both SI respectively, again resulting in partial recombination suppression in the F1. We have inserted this information in the Results section (lines 253-255).
207 – Figure 4B does not make this clear because it does not show Fst for each inversion, only for SNPs within inversion. Also note that a new inversion is typically fixed for one allele at any locus where the ancestral arrangement is polymorphic. There are then many combinations of inversion and SNP frequencies in two populations for which Fst is higher for the SNP than for the inversion, without any requirement for gene flux (as suggested on line 209). Gene flux is needed for cases where both arrangements carry both alleles (as seems to happen in some cases in Figure 4D, although the situation here is complicated by the putatively overlapping inversions which can enhance gene flux).
We have calculated the FST values for inversions 2L, 2R, 3L and 3R and added them to Figure 3 (previously Figure 2). In the text we now refer to both figures 3 and 5 (previously 2 and 4), to make the comparison (lines 263-266). We are aware that a new inversion is usually fixed for one inverted allele, while the ancestral arrangement is polymorphic, allowing for FST values of polymorphic SNPs within the inversion to exceed the FST values of the inversion. This implies that individual SNPs within the ancestral arrangement are strongly differentiated between the timing types (while the inverted haplotype is considered to be fixed for one allele, which actually sets some limits to genetic differentiation of the SNP). We have included this possibility in the text (lines 267-269). Our test for the period locus then shows that there is not just differentiation in the ancestral arrangement, but gene flux between the arrangements (lines 273-284).
217-8 – there is no expectation that differentiation at a single locus will be protected by an inversion so it would help to reword this sentence a bit. I think the authors mean that differentiation at period is not enhanced by association with other locally-adaptive variants that is maintained by an inversion.
We agree with the reviewer and have reworded the sentence accordingly (lines 281-284).
250-1 – this comes back to issues in the Introduction with the 'magic trait' idea. The basis of this idea is that recombination is less important because recombination cannot separate the two (or more) components of reproductive isolation influenced by the trait. If one component of isolation is local adaptation (as proposed here but not directly established) then each locus influencing the trait will be under divergent selection. An inversion might be favoured because it maintains advantageous combinations of these loci (the Kirkpatrick-Barton mechanism) but this is different from the case where assortative mating (for example) is not under direct selection and so only evolves in response to indirect selection (as in classical models of reinforcement).
We have completely rewritten this paragraph (lines 321-337). Having now introduced a clear distinction between genetic linkage, magic traits and pleiotropy in the introduction, we here argue that given our data we don’t think genetic linkage is the mechanism for coupling the components of reproductive isolation. We elaborate on why recombination is not important for a magic trait and that we expect divergent selection on all loci affecting the magic trait. We conclude that selection on these loci might favor the spread of certain inversion haplotypes and that this creates a new substrate for additional, genetically linked components of reproductive isolation to evolve.
251-3 – this is a slightly odd distinction. I think it needs to be explained a bit more fully and, ideally, related to relevant models for the spread of inversions.
In course of rewriting the whole paragraph (see comment above), we no longer make this distinction. We now explicitly refer on how natural selection might drive the spread of certain inversions (lines 328-333).
265-7 – this comment about what is 'generally assumed' should be supported by citations. It reflects a view in the Introduction that 'magic traits' are expected to have a simple genetic basis, a view that I consider incorrect (see earlier comments).
We agree with the reviewer that magic traits do not need to have a simple genetic basis. Therefore, we have removed the respective section from the discussion.
272-3 – I am not sure that it is helpful to introduce the idea of 'evolutionary branching' here without further discussion. It would be more in line with the rest of the manuscript to say that the timing ecotypes might accumulate other components of reproductive isolation and eventually complete speciation.
We have rewritten the entire paragraph based on comments from the other reviewer. As a consequence, the idea of evolutionary branching is no longer mentioned. The idea that timing ecotypes might accumulate other components of reproductive isolation and eventually complete speciation was added to the paragraph above (lines 328-337).
287-8 – in line with previous comments, I think that describing the genetic basis of a putative ‘magic trait’ is a valuable contribution but does not require any re-evaluation.
We agree with the reviewer and have rewritten the sentence accordingly (lines 378-380).
Methods
328-30 – this identifies the margins of the region of LD, which do not necessarily correspond to breakpoints.
We are fully aware that the region of LD can only approximate the breakpoints and have changed the wording accordingly (lines 420-423).
372-3 – this does not provide sufficient information – see comments on the Results.
We inserted a statement of how the peaks are delineated (lines 466-469).
384-5 – this analysis used all F2 families derived from one pair of grandparents, I believe. This should be made explicit. The inversion genotypes of the parents should be determined (and ideally of the F1 individuals, but this would have to be inferred from their offspring, I think), because the parents may have been heterozygous and so the F1 may have varied.
The information that it was a single pair of parents was already given in the section above (lines 463-466). For clarity, we have repeated it in the section on QTL mapping (lines 481-482). We did determine the inversion genotypes of the parents and have inserted the information in the text (lines 244-246; 253-255, see comments above). Indeed, the F1 likely varied for inversion genotypes, leading to recombination suppression in some families, but not others, i.e. partial recombination suppression overall.
389-90 – so far as I can see this was without formal testing for dominance effects.
The R/qtl function we used (fitqtl) includes the calculation of dominance effects, which are given in Supplementary File 5d (previously Supplementary Table 4). We wrote that the model we used (y = Q1 + Q2 + Q3 + Q4) consisted of “four additive QTL” as a contrast to putative epistatic effects, not as a contrast to dominance effects. In order to avoid the confusion, we have replaced “additive” with “non-epistatic” (lines 489-491).
395-413 – it seems that the results of these analyses are not available anywhere.
Initially, we only picked the support we could find for specific inversions from the large SV dataset. We now include the global results of SV calling (Supplementary File 1) and the sequence data behind it (ENA accession PRJEB54033). We did not include a detailed analysis of all SV calls, because we consider it distracting from the story presented here. A full account of the SV calls will be given in a separate manuscript.
Reviewer #2 (Recommendations for the authors):
This study system promises to hold exciting insights into the genetic basis of ecological traits but there are limitations in the analyses and interpretation that temper my overall enthusiasm for the paper. These are included in my public review but I list them with specific analysis suggestions below:
1) The QTL mapping relies on a modest number of individuals and there are important details missing. The authors should provide the overall heritability of the trait, which can be estimated from their parental and offspring data with some assumptions. For the estimates of QTL effect sizes from such small samples, there is a common issue known as the Beavis effect that can inflate the effect size of individual QTL. Related to this, the total effect size of the QTL reported is extremely high and the authors should consider (e.g. with simulations) that the effect size is substantially inflated. The authors should report individual effect sizes and explore their power to detect QTL with different effect sizes given their study design. I think that consideration of heritability and overall power will help with interpretation of this result.
We appreciate these comments but we are not sure how they would affect our overall conclusions. Heritability is evidently dependent on the environmental conditions, which would be rather different in nature, compared to the laboratory conditions. It is not clear what we would gain by calculating a heritability for our QTL experiment. Additionally, even if we make the assumption that environmental variance is equal in the F1 and F2 generations and thus calculate the genetic variance as the difference between the variances in the F1 and the F2, we are still lacking an independent estimate of environmental variance and we can still not calculate broad sense heritability. We do not see which additional reasonable assumptions we can make in order to obtain an estimate of heritability. But we provide now the individual effect sizes for the QTLs in the text (they were previously somewhat hidden in the suppl. tables). In addition, we point out that they could be overestimated about 3-fold, due to the Beavis effect (lines 236-240). Given that all evidence points to an oligogenic architecture, an overestimate of effect sizes is actually a conservative assumption.
Similarly, we think that performing simulations to estimate the power of our QTL mapping approach would not help the overall argument (and is also somewhat beyond the scope of the paper). We expect that we are likely missing QTL with minor effects and state so in the revised manuscript (lines 239-240).
2) My major issue with the paper is the interpretation of divergence within the inversion as linked causally to the genes underlying ecological divergence. As the authors observe, divergence will vary within an inverted region. This can be traced to myriad factors, including variation in mutation rate, variation in constraint, patterns of ancestral polymorphism within this region, and variation in gene conversion within the inversion. Given this, I do not think it is valid to interpret the regions of high differentiation as the causal drivers of the ecological differentiation. In my view, the results need to be reframed in the context of ecological adaptation dependent on large inversions and potentially interesting candidate genes, without a claim that the results can be narrowed to individual genes.
We agree that divergence peaks can be traced to many different factors. We were not intending to make the claim that all genes underlying these divergence peaks are identical with the genes underlying ecological divergence. But we do think that the divergent genes are good candidates for being ecologically relevant. In the end, we have tempered our wording – as suggested by the editor and the reviewer – and now speak of “potential candidate genes” which “may” point to specific molecular mechanisms underlying ecological divergence (lines 286-290; 300-306).
We don’t think that we can make the alternative claim suggested by the reviewer, namely that adaptation is dependent on the large inversions. Firstly, the inversions cover the largest part of the genome so that some QTL will necessarily overlap with them. Secondly, except for chr1 there is no differentiation of inversions between ecotypes. We have given the FST values now in Figure 3 (previously Figure 2) in order to underline there is no differentiation. We have therefore not followed this suggestion.
3) The authors imply in the discussion based on historical results that the ecotype evolved in situ in the last ~60 years. This speculation should be removed as it seems much more likely that the ecotype was either missed in sampling or a recent migrant introduced it to the area in the recent past.
We were not intending to imply that the ecotype necessarily evolved in situ. We have therefore re-written the entire paragraph to include an alternative time-frame (postglacial, i.e. roughly 10,000 years) and to explicitly refer to the possibility of introgression (while we explicitly do not refer to evolution in situ). See lines 363-368.
https://doi.org/10.7554/eLife.82825.sa2Article and author information
Author details
Funding
European Research Council (802923)
- Tobias S Kaiser
Max-Planck-Gesellschaft
- Tobias S Kaiser
The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication. Open access funding provided by Max Planck Society.
Acknowledgements
Tjorben Nawroth, Kerstin Schäfer, Susanne Mentz, Elke Bustorf and Sina Schirmer provided laboratory assistance. Jürgen Reunert provided animal care. We thank all members of the research group Biological Clocks, as well as Diethard Tautz for feedback and discussion. This work was supported by the Max Planck Society via an independent Max Planck Research Group and by an ERC Starting Grant (No 802923) awarded to TSK. CP was funded by the International Max Planck Research School (IMPRS) for Evolutionary Biology.
Senior and Reviewing Editor
- Detlef Weigel, Max Planck Institute for Biology Tübingen, Germany
Version history
- Received: August 18, 2022
- Preprint posted: September 1, 2022 (view preprint)
- Accepted: February 7, 2023
- Version of Record published: February 28, 2023 (version 1)
- Version of Record updated: April 3, 2023 (version 2)
Copyright
© 2023, Briševac et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
-
- 255
- Page views
-
- 41
- Downloads
-
- 1
- Citations
Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.
Download links
Downloads (link to download the article as PDF)
Open citations (links to open the citations from this article in various online reference manager services)
Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)
Further reading
-
- Ecology
- Genetics and Genomics
Adulis, located on the Red Sea coast in present-day Eritrea, was a bustling trading centre between the first and seventh centuries CE. Several classical geographers--Agatharchides of Cnidus, Pliny the Elder, Strabo-noted the value of Adulis to Greco--Roman Egypt, particularly as an emporium for living animals, including baboons (Papio spp.). Though fragmentary, these accounts predict the Adulite origins of mummified baboons in Ptolemaic catacombs, while inviting questions on the geoprovenance of older (Late Period) baboons recovered from Gabbanat el-Qurud ('Valley of the Monkeys'), Egypt. Dated to ca. 800-540 BCE, these animals could extend the antiquity of Egyptian-Adulite trade by as much as five centuries. Previously, Dominy et al. (2020) used stable istope analysis to show that two New Kingdom specimens of P. hamadryas originate from the Horn of Africa. Here, we report the complete mitochondrial genomes from a mummified baboon from Gabbanat el-Qurud and 14 museum specimens with known provenance together with published georeferenced mitochondrial sequence data. Phylogenetic assignment connects the mummified baboon to modern populations of Papio hamadryas in Eritrea, Ethiopia, and eastern Sudan. This result, assuming geographical stability of phylogenetic clades, corroborates Greco-Roman historiographies by pointing toward present-day Eritrea, and by extension Adulis, as a source of baboons for Late Period Egyptians. It also establishes geographic continuity with baboons from the fabled Land of Punt (Dominy et al., 2020), giving weight to speculation that Punt and Adulis were essentially the same trading centres separated by a thousand years of history.
-
- Genetics and Genomics
Generating specific, robust protective responses to different bacteria is vital for animal survival. Here, we address the role of transforming growth factor β (TGF-β) member DBL-1 in regulating signature host defense responses in Caenorhabditis elegans to human opportunistic Gram-negative and Gram-positive pathogens. Canonical DBL-1 signaling is required to suppress avoidance behavior in response to Gram-negative, but not Gram-positive bacteria. We propose that in the absence of DBL-1, animals perceive some bacteria as more harmful. Animals activate DBL-1 pathway activity in response to Gram-negative bacteria and strongly repress it in response to select Gram-positive bacteria, demonstrating bacteria-responsive regulation of DBL-1 signaling. DBL-1 signaling differentially regulates expression of target innate immunity genes depending on the bacterial exposure. These findings highlight a central role for TGF-β in tailoring a suite of bacteria-specific host defenses.