1. Genetics and Genomics
Download icon

Genome-wide mapping in a house mouse hybrid zone reveals hybrid sterility loci and Dobzhansky-Muller interactions

  1. Leslie M Turner
  2. Bettina Harr  Is a corresponding author
  1. Max Planck Institute for Evolutionary Biology, Germany
  2. University of Wisconsin, United States
Research Article
  • Cited 57
  • Views 3,282
  • Annotations
Cite this article as: eLife 2014;3:e02504 doi: 10.7554/eLife.02504


Mapping hybrid defects in contact zones between incipient species can identify genomic regions contributing to reproductive isolation and reveal genetic mechanisms of speciation. The house mouse features a rare combination of sophisticated genetic tools and natural hybrid zones between subspecies. Male hybrids often show reduced fertility, a common reproductive barrier between incipient species. Laboratory crosses have identified sterility loci, but each encompasses hundreds of genes. We map genetic determinants of testis weight and testis gene expression using offspring of mice captured in a hybrid zone between M. musculus musculus and M. m. domesticus. Many generations of admixture enables high-resolution mapping of loci contributing to these sterility-related phenotypes. We identify complex interactions among sterility loci, suggesting multiple, non-independent genetic incompatibilities contribute to barriers to gene flow in the hybrid zone.


eLife digest

Different species have often evolved from a common ancestor. In order to become distinct species, however, the different groups of descendants of that ancestor must have become isolated from one another at some point in their history so that they could no longer mate or reproduce. For example, a mountain or a river might create a physical barrier that keeps species apart, so that if the species meet up again they may struggle to mate or produce offspring. Furthermore, any ‘hybrid’ offspring that are produced may themselves struggle to survive or successfully reproduce.

Examining the genes of the hybrid offspring that result when two recently separated species crossbreed could help us to understand how new species evolve. However, the challenges of finding enough suitable hybrids to compare means that few studies have so far investigated the genetic changes that occur to make reproduction between separate species difficult.

Two subspecies of the house mouse—Mus musculus musculus and Mus musculus domesticus—live alongside each other in a region of central Europe and can mate and produce hybrid offspring. Male hybrid mice are commonly less fertile than non-hybrids; this acts as a barrier to reproductive success that helps to maintain the separation between the two subspecies.

Turner and Harr captured wild hybrid mice, bred them in the laboratory, and studied their offspring. This strategy enabled them to measure fertility in mice very similar to wild-caught hybrids, but now all individuals can be measured at the same age and under the same environmental conditions.

A method called a genome-wide association study can be used to survey the genes of individuals with a particular disease or physical characteristic in an effort to identify gene variants that are associated with that condition. In many species, the weight of a male's testes has been linked to their fertility—small testes mean the male is likely to be less fertile. Changes in how genes are ‘expressed’ in the testes can also reduce fertility.

Turner and Harr used a genome-wide association study to investigate which genetic changes are linked to changes in testis weight or how genes in the testes are expressed in the offspring of hybrid mice. This revealed that many separate genetic regions are involved; including some that had not previously been identified. Turner and Harr then examined how these gene regions interact with each other. With the exception of one gene, all interacted with at least one of the other genetic regions that had been identified, forming a complex network of interactions.

Although a genome-wide association study reveals which genes are altered in hybrid mice with small testes, it does not reveal which of these genes actually cause the changes in testis size and gene expression. However, the work of Turner and Harr greatly narrows down the candidates for further investigation.



New species arise when reproductive barriers form, preventing gene flow between populations (Coyne and Orr, 2004). Recently, two approaches have substantially advanced the understanding of the genetic mechanisms underlying reproductive isolation (Reviewed in Noor and Feder (2006); Reviewed in Wolf et al. (2010)). Genetic crosses in the laboratory involving model organisms have identified loci and genes causing hybrid defects, a common type of reproductive barrier caused by genetic interactions between divergent alleles at two or more loci (Bateson, 1909; Dobzhansky, 1937; Muller, 1942). In nature, recent technological advances enable fine-scale characterization of genome-wide patterns of divergence between incipient species and variation in hybrid zones.

For example, ‘islands of divergence’ have been reported in species pairs from taxonomically diverse groups (Turner et al., 2005; Nadeau et al., 2011; Nosil et al., 2012; Ellegren et al., 2013; Hemmer-Hansen et al., 2013; Renaut et al., 2013; Carneiro et al., 2014; Poelstra et al., 2014; Schumer et al., 2014). These high-divergence genomic outlier regions are sometimes referred to as ‘islands of speciation’, resistant to introgression because they harbor genes causing reproductive isolation. However, other forces can create similar genomic patterns, thus islands may not always represent targets of selection that contributed to speciation (Noor and Bennett, 2009; Turner and Hahn, 2010; Renaut et al., 2013; Cruickshank and Hahn, 2014).

An alternative approach to identify genomic regions contributing to reproductive isolation is to map known reproductive barrier traits in naturally hybridizing populations. The potential for mapping in hybrid zones is long-recognized (Kocher and Sage, 1986; Harrison, 1990; Szymura and Barton, 1991; Briscoe et al., 1994; Reviewed in Rieseberg and Buerkle (2002)). Hybrid zones are ‘natural laboratories for evolutionary studies’ (Hewitt, 1988), enabling investigation of speciation in progress. The Dobzhansky-Muller model predicts that hybrid incompatibilities between incipient species accumulate faster than linearly with time (Orr, 1995), thus investigating taxa in the early stages of speciation facilitates identification of incompatibilities that initially caused reproductive isolation vs incompatibilities that arose after isolation was complete.

Despite these advantages, few studies have mapped barrier traits or other fitness-related traits in nature, due to the logistical challenges of collecting dense genome-wide genetic markers in species with admixed populations and well-characterized phenotypes. Examples include associations between pollen sterility and genomic regions showing reduced introgression in a sunflower hybrid zone (Rieseberg et al., 1999) and loci contributing to variation in male nuptial color and body shape mapped in a recently admixed stickleback population (Malek et al., 2012).

House mice (Mus musculus) are a promising model system for genetic mapping in natural populations (Laurie et al., 2007) and have an abundance of genetic tools available to ultimately isolate and characterize the causative genes underlying candidate loci. Three house mouse subspecies—M. m. musculus, M. m. domesticus, and M. m. castaneus–diverged ∼500,000 years ago from a common ancestor (Reviewed in Boursot et al. (1993); Salcedo et al. (2007); Geraldes et al. (2008)). M. m. musculus and M. m. domesticus (hereafter, musculus and domesticus) colonized Europe through different geographic routes and meet in a narrow secondary contact zone running through central Europe from Bulgaria to Denmark (Sage et al., 1986; Boursot et al., 1993). Genome-wide analyses of patterns of gene flow in several geographically distinct transects across the hybrid zone have identified genomic regions showing reduced introgression, which may contribute to reproductive isolation (Tucker et al., 1992; Macholan et al., 2007; Teeter et al., 2008, 2010; Janousek et al., 2012).

Reduced male fertility is common in wild-caught hybrids (Albrechtová et al., 2012; Turner et al., 2012) and in musculusdomesticus hybrids generated in the laboratory (Britton-Davidian et al., 2005; Reviewed in Good et al. (2008a)), implying that hybrid sterility is an important barrier to gene flow in house mice. Mapping studies using F1 and F2, hybrids generated from laboratory crosses between house mouse subspecies have identified many loci and genetic interactions contributing to sterility phenotypes (Storchova et al., 2004; Good et al., 2008b; White et al., 2011; Dzur-Gejdosova et al., 2012; Turner et al., 2014). Prdm9, a histone methyltransferase, was recently identified as a gene causing F1 hybrid sterility and is the first mammalian hybrid incompatibility gene identified (Mihola et al., 2009). Comparisons between different F1 crosses show that hybrid sterility alleles are polymorphic within subspecies (Britton-Davidian et al., 2005; Good et al., 2008a). Furthermore, reduced fertility phenotypes observed in nature vary in severity; complete sterility, as documented in some F1 crosses, appears to be rare or absent in the hybrid zone (Albrechtová et al., 2012; Turner et al., 2012). Taken together, studies of hybrid sterility in house mice indicate that, even in the early stages of speciation, the genetic basis of hybrid defects can be complex. Studies of gene flow in the hybrid zone and of hybrid sterility in the laboratory both have advantages and have shed light on the speciation process. Mapping sterility phenotypes in natural hybrids can potentially integrate insights from the two approaches by identifying associations between hybrid incompatibility loci and reduced gene flow across the hybrid zone.

In this study, we map sterility-related phenotypes in hybrid zone mice to investigate the genetic architecture of reproductive isolation between incipient species. We performed a genome-wide association study (GWAS) to map testis weight and testis gene expression in 185 first generation lab-bred offspring of wild-caught hybrid mice (Figure 1—figure supplement 1). GWAS have been powerful in humans, loci contributing to hundreds of quantitative traits associated with disease and other phenotypic variation has been identified (Reviewed in Stranger et al. (2011)). Examples of GWAS for fitness-related traits in non-humans are only beginning to emerge (Johnston et al., 2011; Filiault and Maloof, 2012; Magwire et al., 2012).

Our hybrid zone GWAS identified genomic regions associated with variation in relative testis weight (testis weight/body weight) and genome-wide testis expression pattern, including regions previously implicated in hybrid sterility as well as novel loci. Motivated by the Dobzhansky–Muller genetic model of hybrid defects, we tested for genetic interactions (Dobzhansky–Muller interactions—‘DMIs’) between loci affecting testis weight or expression pattern. All loci except one showed evidence for interaction with at least one partner locus and most interact with more than one partner. The deviations in phenotype associated with most interactions were large–affected individuals have phenotypes below the range observed in pure subspecies–suggesting that these interactions indeed are hybrid incompatibilities. To our knowledge, this is the first GWAS for a reproductive barrier trait. Using natural hybrids provided high mapping resolution that will facilitate future studies to identify causative genes; for example, a majority of GWAS regions contain 10 or fewer genes. Moreover, this study provides the first genome-scale description of a hybrid incompatibility network in nature.


Sterility-associated phenotypes

We investigated two phenotypes in males from the house mouse hybrid zone: relative testis weight (testis weight/body weight) and genome-wide testis gene expression pattern. Both of these phenotypes have previously been linked to hybrid male sterility in studies of offspring from crosses between musculus and domesticus and mice from the hybrid zone (Britton-Davidian et al., 2005; Rottscheidt and Harr, 2007; Reviewed in Good et al. (2008a), (2010); White et al. (2011); Turner et al. (2012), (2014)). We refer to these as ‘sterility phenotypes’, following conventional terminology in the field, however, it is important to note that the severity of defects observed in most hybrid zone mice is consistent with reduced fertility/partial sterility (Turner et al., 2012).

Testis expression PC1 (explaining 14.6% of the variance) is significantly correlated with relative testis weight (cor = 0.67, P = 2 × 10−16), indicating that there is a strong association between those two sterility phenotypes (Figure 1—figure supplement 2). Principal component 2 (PC2, 8.1% variance) is strongly correlated with hybrid index (% musculus autosomal SNPs: cor = 0.75, P = 2 × 10−16), thus the effect of hybrid defects does not obscure subspecies differences in expression.

In the mapping population, 19/185 (10.2%) individuals had relative testis weight below the minimum observed in pure subspecies males and 21/179 (11.7%) individuals had expression PC1 scores below (PC1 = −46.97) the pure subspecies range.

Association mapping

We identified four SNPs significantly associated with relative testis weight in three regions on the X chromosome using stringent thresholds determined by permutation (Table 1; Figure 1A). An additional 51 SNPs were significant using a more permissive significance threshold (false discovery rate (FDR) < 0.1). Significant SNPs were clustered in 12 genomic regions (of size 1 bp to 13.3 Mb; Table 1). We report GWAS regions defined using the permissive FDR threshold because we plan to combine mapping results from multiple phenotypes to identify candidate sterility loci, based on the idea that spurious associations are unlikely to be shared among phenotypes. Significant regions were located on the X chromosome and 9 autosomes, suggesting a minimum of 10 loci contribute to variation in testis weight. It is difficult to estimate the precise number of genes involved, because the extent of linkage disequilibrium (LD) of significant SNPs around a causative mutation depends on the phenotypic effect size, recombination rate, allele frequency, and local population structure. Multiple significant regions might be linked to a single causative mutation, or conversely, a significant region might be linked to multiple causative mutations in the same gene or in multiple genes.

Table 1

Genomic regions significantly associated with relative testis weight

Region*1ChrPosition (Mb)Length (kb)Sig. SNPs (5% perm)No. sign SNPs expression§Interactions#Concordant PC1 regionConcordant sterility loci**Sterile Allele††No. genes (coding) ‡‡Candidate Genes§§
RTW011173.30–173.3440.7115PC03d3 (3)
RTW02233.152.6104PC04BHZm*1 (1)
RTW032129.59–129.6559.8112PC08TWAd2 (1)
RTW05964.40103U1 (1)
RTW071237.16–41.524364.2447PC29D20 (12)Arl4aEFG
RTW091756.68–58.441752.2428PC43SCbinA; TWA; BHZM42 (39)Acsbg2E; ClppG; SafbG; Tmem146EG
RTW10X12.171 (1)14PC46ASHD; eQTLHSC; HTA; SCAm*1 (1)
RTW11X85.13–98.4313294.335 (2)353PC49ASHD; DBTA; eQTLHSC; FERTB; HTA; PBTA; SCB; TASA; TWBD; BHZD191 (67)ArEFG; ArxG; Pcyt1bEFG; Tex11EFG; ZfxEFG
RTW12X127.57–134.136555.54 (1)42PC50ASHD; eQTLHSC; shPC1A; SCD; TWD; BHZD158 (71)Nxf2G; Taf7lEFG
  1. *

    Significant SNPs <10 Mb apart were combined into regions.

  2. Significant intervals were defined by positions of the most proximal and distal SNPs with LD > 0.9 to a significant SNP.

  3. The number of SNPs significant at FDR < 0.1 is reported; number of significant SNPs significant with <0.05 P value in permutations is in parentheses.

  4. §

    Number of significant SNPs enriched for associations with transcripts expressed on another chromosome (P < 0.05; FDR < 0.1; >30 transcripts).

  5. #

    Number of regions with significant interactions.

  6. Overlapping regions significant for expression PC1 (see Table 2).

  7. **

    Sterility QTL overlapping or within 10 Mb from A(White et al. 2011), B(Dzur-Gejdosova et al. 2012), C(Turner et al. 2014), D(Good et al. 2008b). Abbreviations for phenotypes: ASH: abnormal sperm head morphology, TW: testis weight, SC: sperm count, shPC1: sperm head shape PC1, eQTLHS: trans eQTL hotspot, FERT: fertility, PBT: proximal bent sperm tail, HT: headless/tailless sperm, DBT: distal bent sperm tail, TAS: total abnormal sperm. BHZ: overlapping candidate regions with evidence from epistasis in the Bavarian hybrid zone transect (Janousek et al. 2012).

  8. ††

    Sterile allele inferred on the basis of frequency of a majority of significant SNPs in pure subspecies samples: D–domesticus; M–musculus; lower-case indicates FST< 0.7 between pure subspecies; * indicates overlapping PC1 region is D sterile; U–nondiagnostic SNP and/or no majority allele.

  9. ‡‡

    Number of genes (protein-coding) overlapping region.

  10. §§

    Genes with roles in male reproduction on the basis of Emale reproduction gene ontology terms (see ‘Materials and methods’) or phenotypes of knockout models reported in F(Matzuk and Lamb 2008) or GMGI database.

Table 1—source data 1

Protein-coding genes in significant relative testis weight regions.

Figure 1 with 2 supplements see all
Manhattan plot of GWAS results.

Single SNPs associated with (A) relative testis weight, (B) testis expression principal component 1, and (C) expression of transcripts located on other chromosomes (trans). Dashed lines indicate significance thresholds based on: permutations for autosomes (labeled 5% perm A), permutations for X chromosome (labeled 5% perm X), false discovery rate <0.1 (labeled 10% FDR), and 95th percentile of significant transcript association counts across SNPs (labeled 95%).

Figure 1—source data 1

SNPs significantly associated with relative testis weight and/or testis expression PC1 (excel file).


We identified 104 SNPs on the X and chromosome 1 significantly associated with expression PC1 using stringent permutation-based thresholds (Table 2; Figure 1B). An additional 349 SNPs were significant with the more permissive threshold of FDR < 0.1. Significant SNPs clustered in 50 genomic regions located on 18 autosomes and the X.

Table 2

Genomic regions significantly associated with testis expression PC1

Region*ChrPosition (Mb)Length (kb)Sig. SNPs (5% perm)No. sign SNPs expression§Interactions#Concordant RTW regionConcordant sterility loci**Sterile Allele††No. genes (coding) ‡‡Candidate Genes§§
PC0118.01–12.724715.24446BHZU40 (18)Mybl1GH
PC031166.84–185.8318988.228 (1)2547RTW01BHZD297 (229)Adcy10FGH; Atp1a4H; Ddr2GH; DeddH; Exo1FGH; F11rG; H3f3aGH; LbrH; Lmx1aH; MaelFH; MpzH; Vangl2H
PC04221.72–49.0127288.0303045RTW02TWA; BHZD604 (334)Acvr2aGH; BmycH; Grin1H; Il1rnI; Lhx3H; Notch1F; Nr5a1GH; Nr6a1F; Odf2FH; Pax8GH; Sh2d3cH; Sohlh1GH; StrbpFGH; Tsc1H
PC05267.001138TWAd1 (0)
PC06284.56–84.68125.61138eQTLHSC; TWAd8 (7)
PC072114.21–116.792579.47741eQTLHSC; TWA; BHZD20 (4)
PC082129.59–129.6559.81116RTW03TWAd2 (1)
PC09363.61–63.625.52236DBTAM1 (1)
PC1143.14–11.168023.38844D98 (31)Ccne2H; Chd7H; Plag1H
PC13537.811140m1 (1)
PC1465.78–5.90121.61128BHZd1 (1)
PC1577.09–7.109.01124shPC1Ad1 (1)
PC16735.471121shPC1Ad1 (1)
PC177140.36–140.98620.93343D8 (7)
PC18837.561122STAAd1 (1)
PC19874.15–74.1720.91133STAAd1 (1)
PC20890.23–106.7716539.65345STAA; BHZD146 (101)Bbs2GH; Ccdc135F; Csnk2a2GH; Katnb1H; Nkd1FH
PC218118.11–120.562451.02240STAAU23 (16)
PC23957.23–60.593359.65441D69 (54)2410076I21RikF; Bbs4GH; Cyp11a1GH
PC271167.99–69.471479.71131shPC1AD67 (46)AurkbH; Odf4F; ShbgF; Trp53H
PC28127.85–16.138278.4191947D54 (32)ApobFGH; Gdf7GH; Pum2H
PC291228.99–54.2225238.3464447RTW07BHZD150 (93)AhrGH; Arl4aGH; Immp2lFGH; Slc26a4H
PC321429.53–32.212675.55443STAA; TWBD44 (35)ChdhH; Dnahc1G; TktH
PC331466.74–75.018274.92241SCBU98 (71)Fndc3aFGH; Gnrh1GH; Npm2F; Piwil2FGH; Rb1H
PC3414121.69–121.7783.21131d1 (1)
PC351527.75–31.463701.35547HTA; TASAD19 (8)
PC361545.671136HTA; TASAd0
PC371573.001127eQTLHSCd1 (1)
PC38168.18–18.5110329.1565641BHZD201 (132)Prm1FGH; Prm2FGH; Prm3F; Ranbp1H; Rimbp3H; Rpl39lF; Snai2H; Spag6FGH; Tnp2FGH; Top3bI; Tssk1FH; Tssk2FH
PC411690.92–90.9311.61135STAAd1 (1)
PC421711.05–11.18132.73343eQTLHSC; FERTB; SCAB; TWABDh1 (1)
PC431742.08–63.2921217.1131145RTW09SCA; TWA; BHZMd272 (209)Acsbg2F; ClppH; DazlFGH; Klhdc3F; Mea1F; Pot1bH; SafbH; Sgol1F; Tcte1H; Tdrd6H; Tmem146F; Ubr2FGH; Zfp318H
PC441777.34–83.596248.82233TWAD53 (36)
PC451944.82–45.74918.110946BHZD23 (16)BtrcGH; DpcdH
PC46X11.34–19.347995.319 (7)1944RTW10ASHD; eQTLHSC; HTA; SCAD; TWD; BHZD82 (21)
PC48X68.03–70.772742.24 (1)343ASHAE; DBTA; eQTLHSC; FERTB; HTA; OFFE; PBTA; SCBE; TASA; TWBE; BHZU62 (41)Cetn2F; Mtm1H
PC49X83.62–108.5324911.7125 (84)12545RTW11DBTA; eQTLHSC; FERTB; HTA; PBTA; shPC1A; SCB; TASA; TWBD; BHZD407 (142)ArGH; ArxH; Atp7aH; Pcyt1bFGH; Tex11FGH; TsxH; ZfxFGH
PC50X127.01–137.3710365.121 (11)2145RTW12ASHD; eQTLHSC; shPC1A; SCD; TWD; BHZD212 (92)Nxf2H; Taf7lFGH; Tsc22d3H
  1. *

    Significant SNPs <10 Mb apart were combined into regions.

  2. Significant intervals were defined by positions of the most proximal and distal SNPs with LD > 0.9 to a significant SNP.

  3. The number of SNPs significant at FDR < 0.1 is reported; number of significant SNPs significant with <0.05 P value in permutations is in parentheses.

  4. §

    Number of significant SNPs enriched for associations with transcripts expressed on another chromosome (P < 0.05; FDR < 0.1; >30 transcripts).

  5. #

    Number of regions with significant interactions.

  6. Overlapping regions significant for relative testis weight (see Table 1).

  7. **

    Sterility QTL overlapping or within 10 Mb from A(White et al. 2011), B(Dzur-Gejdosova et al. 2012), C(Turner et al. 2014), D(Good et al. 2008b), E(Storchova et al. 2004). Abbreviations for phenotypes: ASH: abnormal sperm head morphology, TW: testis weight, SC: sperm count, shPC1: sperm head shape PC1, eQTLHS: trans eQTL hotspot, STA: seminiferous tubule area, FERT: fertility, PBT: proximal bent sperm tail, HT: headless/tailless sperm, DBT: distal bent sperm tail, TAS: total abnormal sperm, OFF: number of offspring. BHZ: overlapping candidate regions with evidence from epistasis in the Bavarian hybrid zone transect (Janousek et al. 2012).

  8. ††

    Sterile allele inferred on the basis of frequency of a majority of significant SNPs in pure subspecies samples: D–domesticus; M–musculus; lower-case indicates FST< 0.7 between pure subspecies; * indicates overlapping PC1 region is D sterile; U–nondiagnostic SNP and/or no majority allele; Dh–two SNPs with domesticus sterile alleles, one SNP heterozygous genotype shows sterile pattern; Md–majority musculus sterile alleles but some SNPs diagnostic domesticus sterile alleles.

  9. ‡‡

    Number of genes (protein-coding) overlapping region.

  10. §§

    Genes with roles in male reproduction on the basis of Fmale reproduction gene ontology terms (see ‘Materials and methods’) or phenotypes of knockout models reported in G(Matzuk and Lamb 2008) or HMGI database.

Table 2—source data 1

Protein-coding genes in significant testis expression PC1 regions.


To gain further insight into associations between sterility and gene expression, we mapped expression levels of individual transcripts. A total of 18,992/36,323 probes showed significant associations with at least one SNP. We focused on trans associations (SNP is located on different chromosome from transcript), based on evidence from a study in F2 hybrids that trans expression QTL (eQTL) are associated with sterility while cis eQTL are predominantly associated with subspecies differences (Turner et al., 2014). To identify SNPs significantly enriched for trans associations with expression, we used a threshold set at the 95% percentile of significant probe association counts across all SNPs (i.e., SNPs that showed associations with at least 30 transcripts, Figure 1C).

There was substantial overlap between mapping results for testis weight and expression PC1; 48/55 SNPs significant for relative testis weight (9 regions) were also significant for expression PC1. A permutation test, performed by randomly shuffling the positions of GWAS regions in the genome, provides strong evidence that this overlap is non-random (p < 0.0001, 10,000 permutations). Most SNPs significant for testis weight and/or expression PC1 were significantly enriched for trans associations with individual transcripts (relative testis weight: 49/55 SNPs, 8/12 regions; PC1: 440/453 SNPs, 50/50 regions). The combined mapping results provide multiple lines of evidence for contributions of all 50 PC1 regions and 9/12 testis weight regions. The three testis-weight regions (RTW04, RTW05, RTW08) not significantly associated with testis expression phenotypes are more likely to be spurious and are weaker candidates for future study.

Genetic interactions

Power to identify pairwise epistasis in GWAS for quantitative traits is limited even with very large sample sizes, due to multiple testing issues (e.g. Marchini et al., 2005). The Dobzhansky-Muller model predicts that the effect of each hybrid defect gene depends on interaction with at least one partner locus. Hence, for hybrid sterility traits, there is a hypothesis-driven framework in which to limit tests for epistasis to a small subset of possible interactions.

We tested for genetic interactions between all pairs of significant SNPs (FDR < 0.1) located on different chromosomes for testis weight and for expression PC1. We identified 142 significant pairwise interactions for relative testis weight, representing 22 pairs of GWAS regions (Figure 2A). These results provide evidence for a minimum of 13 autosomal–autosomal and five X–autosomal interactions affecting testis weight.

Figure 2 with 1 supplement see all
Significant GWAS regions and interactions in hybrid zone mice.

Results for (A) relative testis weight and (B) testis expression principal component 1 in hybrid zone mice. In (A) orange and yellow boxes in outer rings (outside grey line) indicate quantitative trait loci (QTL) identified for testis weight and other sterility phenotypes in previous studies (see Table 1 for details). Green boxes indicate significant GWAS regions for relative testis weight. Green lines represent significant genetic interactions between regions; shade and line weight indicate the number of significant pairwise interactions between SNPs for each region pair. In (B) orange boxes in outer rings indicate QTL for testis-related phenotypes (testis weight and seminiferous tubule area) identified in previous studies, yellow boxes indicate QTL for other sterility phenotypes and red boxes indicate trans eQTL hotspots (see Table 2 for details). Green boxes indicate significant GWAS regions for relative testis weight. Purple boxes indicate significant GWAS regions for testis expression PC1. Lines represent significant genetic interactions between regions; color and line weight—as specified in legend—indicate the number of significant pairwise interactions between SNPs for each region pair. Plot generated using circos (Krzywinski et al., 2009).

Figure 2—source data 1

Significant genetic interactions (SNP pairs) for relative testis weight (excel file).

Figure 2—source data 2

Significant genetic interactions (SNP pairs) for testis expression PC1 (excel file).


We identified 44,145 significant interactions between SNPs for expression PC1. The 913 GWAS region pairs provide evidence that at least 144 autosomal–autosomal interactions and 18 X–autosomal interactions contribute to expression PC1 (Figure 2B).

Effect size

We used deviations from population means for single SNPs and two-locus genotypes to estimate the phenotypic effects of GWAS regions and interactions (Figure 3A,B).

Phenotypic effects of testis-weight loci and interactions.

Histograms showing maximum deviations from the population mean for (A) single SNPs and (B) two-locus interactions. Dashed vertical lines indicate minimum values observed in pure subspecies males. (C) Examples of phenotypic means by two-locus genotype for autosomal–autosomal and X–autosomal interactions. Genotypes are indicated by one letter for each locus: D—homozygous for the domesticus allele, H—heterozygous, M—homozygous musculus.


As expected, interactions had greater effects, on average, than single loci for both phenotypes (relative testis weight: single locus mean = −1.81 mg/g, interaction mean = −4.07 mg/g; expression PC1: single locus mean = −81.51, interaction mean = −130.77). We provide examples of autosomal–autosomal and X–autosomal SNP pairs with significant interactions for each phenotype in Figure 3C. It is important to note that mean deviations are rough estimates of effect sizes, which don’t account for family structure.

It is possible that some of the GWAS regions we mapped contribute to quantitative variation within/between subspecies, rather than hybrid defects. The lowest genotypic means for most interactions fell below the range observed in pure subspecies (relative testis weight: 19/22 (86.3%) region pairs; expression PC1: 877/913 (96%) region pairs; Figure 3A,B), consistent with the hypothesis that interactions represent Dobzhansky–Muller incompatibilities.

Mapping simulations

We performed simulations to assess the performance of the mapping procedure for different genetic architectures by estimating the power to detect causative loci and the false positive rate (Figure 4). We simulated phenotypes based on two-locus genotypes from the SNP dataset using genetic models for nine genetic architecture classes (i.e. autosomal vs X linked, varied dominance) with parameters based on the observed distribution of relative testis weight (Figure 4—figure supplement 1, Figure 4—source data 1).

Figure 4 with 2 supplements see all
Mapping power in simulations.

Each panel illustrates results from a single genetic architecture model for (A) 100 autosomal–autosomal SNP pairs and (B) 100 X—autosomal SNP pairs. Each point represents the percentage of data sets generated from a single SNP pair in which locus 1 (domesticus sterile allele; green), locus 2 (musculus sterile allele; purple), or both loci (orange) were identified by association mapping (≥1 SNP significant by permutation based threshold within 10 Mb of ‘causal’ SNP). The x axis indicates the percentage of individuals with partial or full sterility phenotypes. Curves were fit using second order polynomials. In (A), locus 1 indicates the SNPs with musculus alleles sterile and locus 2 indicates the SNPs with domesticus alleles sterile. In (B), locus 1 is the X-linked SNP and locus 2 is the autosomal SNP.

Figure 4—source data 1

Z scores for simulation models.


The distribution of distances to the causal SNP for all significant SNPs located on the same chromosome (Figure 4—figure supplement 2) shows that the majority of significant SNPs (62.7%) are within 10 Mb of the causal SNP, however, a small proportion of significant SNPs are >50 Mb from the causal SNP. In most cases, causal SNPs detected at long distances also had significant SNPs nearby, for example 83.4% of loci with significant SNPs 1–10 Mb distant also have significant SNPs within 1 Mb. These results provide support for our choice to define significant GWAS regions by combining significant SNPs within 10 Mb, and suggest that these regions are likely to encompass the causative gene.

As expected, the power to detect one or both causative loci depended on the location (autosomal vs X-linked), dominance, and frequency of both ‘causative’ alleles (Figure 4, Table 3). For example, the mean percentage of simulations for which both loci were detected (SNP < 10 Mb significant by permutation-based threshold) was six times higher (14.4%) for the X chromosome × autosomal-dominant architecture compared to the autosomal-recessive × autosomal-recessive architecture (2.6%). The relationship between power and the proportion of affected individuals in the mapping population was complex. Interestingly, power was high for some simulations with very few affected individuals. In these cases, the few individuals carrying the lower frequency sterility allele by chance also carried the sterility allele from the second locus, thus the average single allele effects were not diminished by individuals carrying one but not both interacting sterility alleles.

Table 3

Results of mapping simulations

Locus 1 detected,Locus 2 detected,§Both loci detectedMean No. Sig. SNPs
Architecture*Med. Distance to Causal SNP (Mb) X chromosome/Autosome0.2 Mb1 Mb10 Mb0.2 Mb1 Mb10 Mb0.2 Mb1 Mb10 Mb10 Mb#50 Mb#Diff. Chr.
Permutation P<0.05
FDR <0.1
  1. *

    Architecture abbreviations: add–additive; dom–dominant; rec–recessive.

  2. Locus 1 for autosomal pairs is musculus sterile allele; locus 1 for X-autosomal pairs is X-linked.

  3. 3‘detected’–≥1 significant SNP within given distance criterion.

  4. §

    Locus 2 for autosomal pairs has a domesticus sterile allele; locus 2 for X-autosomal pairs is autosomal.

  5. #

    Mean number significant SNPs within distance criterion for either locus.

  6. Mean number significant SNPs on chromosomes not containing ‘causal’ SNPs.

It is important to note that our empirical results suggest that the two-locus models used to simulate phenotypes are overly simplified. We predict that involvement of a sterility locus in multiple incompatibilities would reduce the influence of allele/genotype frequencies of any single partner locus on power.

To estimate the false discovery rate from simulations, we classified significant SNPs not located on the same chromosome as one of the causative SNPs as false positives. Choosing an appropriate distance threshold for false vs true positives on the same chromosome was not obvious given the distribution of distances to causal SNPs (Figure 4-figure supplement 2). We classified significant SNPs <50 Mb from causative SNPs as true positives and excluded SNPs >50 Mb when calculating FDR. Using permutation-based significance thresholds, the median false positive rate was 0.014 (calculated for simulations with ≥10 SNPs within 50 Mb of either causative locus). These results suggest that significant SNPs from the GWAS identified using this more stringent threshold are likely to be true positives. By contrast, the median false positive rate was 0.280 using the FDR < 0.1 threshold, indicating this threshold is more permissive than predicted. Thus, there is a substantial chance that SNP associations with relative testis weight and expression PC1 identified using this threshold are spurious and evidence is weak for GWAS regions comprising one SNP significantly associated with a single phenotype.


Genetic mapping of testis weight and testis gene expression in hybrid zone mice implicated multiple autosomal and X-linked loci and a complex set of interactions between loci. These results provide insight into the genetic architecture of a reproductive barrier between two incipient species in nature.

Association mapping in natural hybrid populations

The potential to leverage recombination events from generations of intercrossing in hybrid zones to achieve high-resolution genetic mapping of quantitative traits has been recognized for decades (Reviewed in Rieseberg and Buerkle, (2002)). Until recently, collection of dense genotype datasets and large sample sizes has not been feasible in natural populations due to logistics and costs. This study demonstrates that loci and genetic interactions contributing to reproductive barrier traits can be identified in a GWAS with a modest sample size (see also related study mapping craniofacial phenotypes in this mapping population Pallares et al. 2014). Sample sizes approximating those used for human GWAS are not necessary if the prevalence and genetic architecture of the trait of interest are favorable. In general, epistasis makes genetic mapping more difficult. However, for hybrid defects, dependence of the phenotype on epistasis conversely may facilitate mapping. Despite substantial deleterious effects in hybrids, incompatibility alleles are not subject to negative selection within species and may be at high frequency or fixed within species. Hence, the prevalence of affected individuals in a hybrid zone for epistatic traits may be much higher than for deleterious traits in pure populations (e.g. disease in humans).

Combining mapping of multiple sterility-related phenotypes substantially improved power to identify sterility loci. We identified a few loci for each phenotype using stringent significance thresholds based on permutation. In addition, most loci identified using more permissive thresholds showed significant associations with more than one phenotype. Spurious associations are unlikely to be shared across phenotypes, thus evidence from multiple phenotypes provided confidence for contributions of nine genomic regions to testis weight (on the X and 5 autosomes) and 50 genomic regions to expression PC1 (on the X and 18 autosomes).

The high resolution of mapping in the hybrid zone provides an advantage over laboratory crosses. For example, significant regions identified here (median = 2.1 Mb, regions with defined intervals) are much smaller than sterility QTL identified in F2s (35.1 Mb; White et al., 2011). Many GWAS regions contain few enough genes that it will be possible to individually evaluate the potential role of each in future studies to identify causative genes. For example, 8/12 testis-weight regions and 28/50 expression PC1 regions contain 10 or fewer protein-coding genes.

We identified candidate genes with known roles in reproduction in four testis-weight regions and 17 expression PC1 regions (Tables 1–2). However, for the majority of regions (8/12 relative testis weight, 33/50 expression PC1), there are no overlapping/nearby genes previously linked to fertility. It is unlikely that these regions would be prioritized if contained in large QTL intervals. High resolution mapping is possible using mapping resources such as the collaborative cross (Aylor et al., 2011) and heterogeneous stocks (Svenson et al., 2012), but these populations represent a small proportion of genetic diversity in house mice (Yang et al., 2011) and hybrid incompatibility alleles may have been lost during strain production.

Polymorphism of hybrid male sterility loci

Comparisons of different F1 crosses between strains of domesticus and musculus have shown that hybrid sterility phenotypes and loci depend on the geographic origins of parental strains (Britton-Davidian et al., 2005; Good et al., 2008a), suggesting that most hybrid sterility alleles are segregating as polymorphisms within subspecies. Several of the loci identified in this study of hybrid zone mice are novel, providing additional evidence that sterility alleles are polymorphic within subspecies. However, a majority of loci we identified in natural hybrids are concordant with previously identified sterility QTL (Tables 1–2, Figure 2). This similarity suggests that there are common genetic factors underlying hybrid sterility in house mice, although there was no statistical support that genome-wide patterns of overlap with previous studies for testis weight or expression PC1 were non-random (p > 0.05, 10,000 permutations).

Prdm9, discovered by mapping F1 hybrid sterility, is the only characterized hybrid sterility gene in mice (Mihola et al., 2009). None of the GWAS regions identified here overlap Prdm9 (chromosome 17, 15.7 Mb). However, one expression PC1 region (PC42) is ∼4 Mb proximal to Prdm9. Reductions in PC1 are observed in individuals that are heterozygous or homozygous for the domesticus allele at PC42. This pattern is partially consistent with sterility caused by Prdm9, which occurs in heterozygous individuals carrying sterile alleles from domesticus (Dzur-Gejdosova et al., 2012; Flachs et al., 2012). We did not find evidence for significant associations between SNPs near Prdm9 and testis weight; the nearest GWAS region (RTW09) is ∼41 Mb distal and low testis-weight is associated with the musculus allele.

There is concordance between some of the genetic interactions between loci identified here and interactions identified by mapping sterility phenotypes and testis expression traits in an F2 cross between musculus and domesticus (White et al., 2011; Turner et al., 2014) (Figure 2—figure supplement 1). Precise overlap between some GWAS regions and interaction regions from F2s identifies strong candidates for future studies to identify the causative mechanisms and genes underlying sterility loci. For example, an interaction between chromosome 12 and the central X chromosome (RTW11, PC49) identified for testis weight and expression PC1 overlaps an interaction affecting testis expression in F2 hybrids (Turner et al., 2014). The 4.3 Mb interval of overlap among chromosome 12 loci (RTW07, PC29, 32.38–41.43 Mb F2s) encompasses 12 protein-coding genes, including a gene with a knockout model showing low testis weight and sperm count (Arl4a) (Schurmann et al., 2002), and two genes with roles in regulating gene expression (Meox2, Etv1).

We compared the positions of GWAS regions to 182 regions (163 autosomal, 19 X-linked) with evidence for epistasis based on a genome-wide analysis of genomic clines in a transect across the house mouse hybrid zone in Bavaria (Janousek et al., 2012), the same region where the progenitors of the mapping population were collected. Five testis-weight regions and 18 expression-PC1 regions overlap candidate regions from the hybrid zone genomic clines analysis (Tables 1–2), however, the patterns of overlap were not statistically significant (p > 0.05, 10,000 permutations). Future introgression analyses using high-density markers within and around GWAS regions may be useful in identifying causative genes and estimating the contributions of sterility alleles to reduced gene flow.

Role of the X chromosome

Three GWAS regions associated with testis weight and five expression PC1 regions are located on the X chromosome. The X-chromosomal regions surpass the stringent permutation-based significance thresholds and thus have strong statistical support. These results are consistent with evidence for an important role for the X in hybrid sterility from laboratory crosses between subspecies strains geographically diverse in origin (Guenet et al., 1990; Elliott et al., 2001; Oka et al., 2004, 2007; Storchova et al., 2004; Good et al., 2008a, 2008b; Mihola et al., 2009; White et al., 2012) and evidence for greatly reduced gene flow of X-linked loci across the European hybrid zone (Tucker et al., 1992; Payseur et al., 2004; Macholan et al., 2007; Teeter et al., 2008, 2010). A disproportionately large contribution of the X chromosome is a common feature of reproductive isolation in many taxa, the so-called ‘large X effect’ (Coyne and Orr, 1989).

The musculus derived X chromosome has been implicated repeatedly in genetic studies of sterility in F1 and F2 hybrids (Reviewed in Good et al. (2008a); White et al. (2011)). By contrast, domesticus alleles were associated with the sterile pattern for most loci we identified on the X in hybrid zone mice (Tables 1–2). A testis expression-QTL mapping study performed in F2s also showed that domesticus ancestry in the central/distal region of the X was associated with a sterile expression pattern (Turner et al., 2014). Differences between studies might reflect geographic variation in sterility alleles, but identification of domesticus-sterile X alleles only in generations beyond the F1 suggests that interactions with recessive autosomal partner loci are essential. The importance of recessive sterility alleles was demonstrated previously by the discovery of multiple novel recessive loci in an F2 mapping study (White et al., 2011). F1 hybrids are essentially absent in nature (Teeter et al., 2008; Turner et al., 2012) because the hybrid zone is ≥30 km wide (Boursot et al., 1993), thus pure subspecies individuals rarely encounter each other. Consequently, recessive autosomal loci acting in F2 and advanced generation hybrids contribute to the maintenance of reproductive isolation in the hybrid zone and may have played important roles in its establishment.

Genetic architecture of hybrid sterility

Despite a growing list of sterility loci and genes identified in a variety of animal and plant taxa, there are few cases of Dobzhansky–Muller incompatibilities for which all partner loci are known (Phadnis, 2011). Hence, there remain many unanswered questions about the genetic architecture of hybrid defects. For example, how many incompatibilities contribute to reproductive barriers in the early stages of speciation? How many partner loci are involved in incompatibilities? Are these patterns consistent among taxa?

The interactions contributing to sterility phenotypes we mapped in hybrid zone mice reveal several general features of the genetic architecture of hybrid sterility. Most sterility loci interact with more than one partner locus. This pattern is consistent with evidence from studies mapping sterility in F1 musculusdomesticus hybrids (Dzur-Gejdosova et al., 2012) and mapping interactions affecting testis gene expression in F2 hybrids (Turner et al., 2014). We did not have sufficient power to map interactions requiring three or more sterility alleles, but interactions between alleles from the same subspecies imply their existence. Loci causing male sterility in Drosophila pseudoobscura Bogota–USA hybrids also have multiple interaction partners; seven loci of varying effect size interact to cause sterility (Phadnis, 2011). In hybrids between Drosophila koepferae and Drosophila buzzatii, sterility is associated with many loci of small effect, consistent with a polygenic threshold model (Moran and Fontdevila, 2014).These studies suggest that biological pathways/networks are often affected by multiple Dobzhansky–Muller interactions; a single pairwise interaction between incompatible alleles disrupts pathway function enough to cause a hybrid defect phenotype, but when more incompatible alleles are present the effects of multiple pairwise interactions are synergistic. Variation in the effect sizes of sterility loci might then reflect variation in the number of networks in which the gene is involved and the connectedness/centrality of the gene within those networks.

Characteristics of the incompatibility network are important for generating accurate models of the evolution of reproductive isolation. A ‘snowball effect’—faster-than-linear accumulation of incompatibilities caused by epistasis—is predicted on the basis of the Dobzhansky–Muller model (Orr, 1995; Orr and Turelli, 2001). Patterns of accumulation of hybrid incompatibilities in Drosophila and Solanum provide empirical support for the snowball hypothesis (Matute et al., 2010; Moyle and Nakazato, 2010). Because most GWAS regions have many interaction partners, our results are not consistent with the assumption of the snowball model that incompatibilities are independent, suggesting that network models of incompatibilities (Johnson and Porter, 2000; Porter and Johnson, 2002; Johnson and Porter, 2007; Palmer and Feldman, 2009) may be more accurate for understanding the evolution of reproductive barriers in house mice.

Involvement of hybrid sterility loci in interactions with multiple partner loci also has important implications for understanding the maintenance of the hybrid zone. Because deleterious effects of a sterility allele are not dependent on a single partner allele, the marginal effect of each locus and thus visibility to selection are less sensitive to the allele frequencies at any single partner locus in the population.

Identifying and functionally characterizing incompatibility genes is an important goal in understanding speciation but is unrealistic in most non-model organisms. By contrast, mapping reproductive isolation traits in natural populations to identify the number and location of loci and interactions is feasible. General features of the genetic architecture of hybrid sterility—the number of incompatibilities and number and effect size of interacting loci—are arguably more likely to be shared among organisms than specific hybrid sterility genes. Comparison of these features among taxa may reveal commonalities of the speciation process.

Materials and methods

Mapping population

Request a detailed protocol

The mapping population includes first-generation lab-bred male offspring of mice captured in the hybrid zone (Bavaria) in 2008 (Turner et al., 2012) (Figure 1—figure supplement 1). We included 185 mice generated from 63 mating pairs involving 37 unrelated females and 35 unrelated males. Many dams and sires were used in multiple mating pairs, thus our mapping population includes full siblings, half siblings, and unrelated individuals. Most mating pairs (53 pairs, 149 offspring) were set up with parents originating from the same or nearby trapping locations. Eleven pairs (36 offspring) include dams and sires originating from more distant trapping locations; phenotypes of these offspring were not reported in Turner et al. (2012).


Request a detailed protocol

Males were housed individually after weaning (28 days) to prevent effects of dominance interactions on fertility. We measured combined testis weight and body weight immediately after mice were sacrificed at 9–12 weeks. We calculated relative testis weight (testis weight/body weight) to account for a significant association between testis weight and body weight (Pearson's correlation = 0.29, p = 4.9 × 10−5).

We classify individuals with relative testis weight below the range observed in pure subspecies as showing evidence for sterility (Turner et al., 2012). To confirm that this is an appropriate threshold for inferring hybrid defects, we compared this value to relative testis weights reported previously for offspring from intraspecific and interspecific crosses (Good et al., 2008a). The pure subspecies minimum we observed is substantially lower (>2 standard deviations) than means for males from intraspecific crosses (converted from single relative testis weight: musculusPWK × musculusCZECH − mean = 10.2, standard deviation = 1.2; domesticusLEWES x domesticusWSB − mean = 11.0, standard deviation = 1.0) and comparable to (within 1 standard deviation) values observed in F1 hybrids from 4/7 interspecific crosses that showed significant reductions (mean plus one standard deviation 4.6–9.2 mg/g).

Testis gene expression

Request a detailed protocol

We measured gene expression in testes of 179 out of the 185 males from the mapping population. Freshly dissected testes were stored in RNAlater (Qiagen, Hilden, Germany) at 4°C overnight, then transferred to −20°C until processed. We extracted RNA from 15–20 mg whole testis using Qiagen RNeasy kits and a Qiagen Tissue Lyser for the homogenization step. We verified quality of RNA samples (RIN >8) using RNA 6000 Nano kits (Agilent) on a 2100 Bioanalyzer (Agilent, Waldbronn, Germany).

We used Whole Mouse Genome Microarrays (Agilent) to measure genome-wide expression. This array contains 43,379 probes surveying 22,210 transcripts from 21,326 genes. We labeled, amplified, and hybridized samples to arrays using single-color Quick-Amp Labeling Kits (Agilent), according to manufacturer protocols. We verified the yield (>2 μg) and specific activity (>9.0 pmol Cy3/μg cRNA) of labeling reactions using a NanoDrop ND-1000 UV-VIS Spectrophotometer (NanoDrop, Wilimington, DE, USA). We scanned arrays using a High Resolution Microarray Scanner (Agilent) and processed raw images using Feature Extraction Software (Agilent). Quality control procedures for arrays included visual inspection of raw images and the distribution of non-uniformity outliers to identify large spatial artifacts (e.g. caused by buffer leakage or dust particles) and quality control metrics from Feature Extraction protocol GE1_QCMT_Dec08.

We mapped the 41,174 non-control probe sequences from the Whole Mouse Genome Microarray to the mouse reference genome (NCBI37, downloaded March 2011) using BLAT ((Kent, 2002); minScore = 55, default settings for all other options). Probes with multiple perfect matches, more than nine imperfect matches, matches to non-coding/intergenic regions only, or matches to more than one gene were excluded. A total of 36,323 probes (covering 19,742 Entrez Genes) were retained.

We preformed preprocessing of microarray data using the R package Agi4x44PreProcess (Lopez-Romero, 2009). We used the background signal computed in Feature Extraction, which incorporates a local background measurement and a spatial de-trending surface value. We used the ‘half’ setting in Agi4x44PreProcess, which sets intensities below 0.5 to 0.5 following background subtraction and adds an offset value of 50. Flags from Feature Extraction were used to filter probes during preprocessing (wellaboveBG = TRUE, isfound = TRUE, wellaboveNEG = TRUE). We retained probes with signal above background for at least 10% of samples. We used quantile normalization to normalize signal between arrays. Expression data were deposited in Gene Expression Omnibus as project GSE61417.

To identify major axes of variation in testis expression, we performed a principal components analysis using prcomp in R (R Development Core Team 2010) with scaled variables.


Request a detailed protocol

We extracted DNA from liver, spleen, or ear samples using salt extraction or DNeasy kits (Qiagen). Males from the mapping population were genotyped using Mouse Diversity Genotyping Arrays (Affymetrix, Santa Clara, CA) by Atlas Biolabs (Berlin, Germany).

We called genotypes at 584,729 SNPs using apt-probeset-genotype (Affymetrix) and standard settings. We used the MouseDivGeno algorithm to identify variable intensity oligonucleotides (VINOs) (Yang et al., 2011); 53,148 VINOs were removed from the dataset. In addition, we removed 18,120 SNPs with heterozygosity >0.9 in any population because these SNPs likely represent additional VINOs. We performed additional filtering steps on SNPs included in the dataset used for mapping. We only included SNPs with a minor allele frequency >5% in the mapping population. SNPs without a genome position or with missing data for >15% of the individuals in the mapping population or pure subspecies reference panel were removed. We pruned the dataset based on linkage disequilibrium (LD) to reduce the number of tests performed. LD pruning was performed in PLINK (Purcell et al., 2007; Purcell n.d.) using a sliding window approach (30 SNPs window size, 5 SNPs step size) and a VIF threshold of 1 x 10−6 (VIF = 1/(1−R2), where R2 is the multiple correlation coefficient for a SNP regressed on all other SNPs simultaneously). This procedure essentially removed SNPs in perfect LD. These filtering steps yielded 156,204 SNPs.

Ancestry inference

Request a detailed protocol

To identify ancestry-informative SNPs, we compared genotypes from 21 pure M. m. domesticus individuals (11 from Massif Central, France and 10 from Cologne/Bonn, Germany) and 22 M. m. musculus individuals (11 from Námest nad Oslavou, Czech Republic and 11 from Almaty, Kazakhstan) (Staubach et al., 2012).

We used Structure (Pritchard et al., 2000; Falush et al., 2003) to graphically represent the genetic composition of our mapping population (Figure 1—figure supplement 1). We included one diagnostic SNP per 20 cM, 3–5 markers/chromosome totaling 60 SNPs genome wide. We used the ‘admix’ model in Structure and assumed two ancestral populations.

Association mapping

Request a detailed protocol

To identify genomic regions significantly associated with relative testis weight and testis gene expression, we used a mixed model approach to test for single SNP associations. Admixture mapping—often applied in studies using samples with genetic ancestry from two distinct populations—was not appropriate for this study because it was not possible to account for relatedness among individuals in the mapping population (Buerkle and Lexer, 2008; Winkler et al., 2010).

We performed association mapping using GEMMA (Zhou and Stephens, 2012), which fits a univariate mixed model, incorporating an n x n relatedness (identity-by-state) matrix as a random effect to correct for genetic structure in the mapping population. We estimated relatedness among the individuals in the mapping population in GEMMA using all markers and the –gk 1 option, which generates a centered relatedness matrix. For each single SNP association test we recorded the Wald test p value. Phenotypes tested include relative testis weight (testis weight/body weight, RTW), testis expression principal component 1 (PC1, 14.6% variance, associated with fertility, Figure 1—figure supplement 2), and normal quantile ranks of gene expression values for individual transcripts. Neither RTW nor expression PC1 was significantly correlated with age at phenotyping (RTW: cor = −0.02, p = 0.72; PC1: cor = 0.01, p = 0.90), thus we did not include age in the model. SNP data, phenotypic data, and kinship matrix to run GEMMA area are available through Dryad at: doi:10.5061/dryad.2br40.

To account for multiple testing, we first determined stringent significance thresholds by permutation. We randomized phenotypes among individuals 10,000 times, recording the lowest p value on the X and the lowest p value on any autosome for each permutation. Thresholds set to the fifth percentile across permutations for RTW were 5.73 × 10−7 (autosomes) and 5.83 × 10−5 (X chromosome); thresholds for expression PC1 were 1.66 × 10−8 (autosomes) and 1.01 × 10−5 (X chromosome). Next, we identified regions using a more permissive significance threshold based on the 10% false discovery rate (Benjamini and Hochberg, 1995), equivalent to p = 3.49 × 10−5 for RTW and p = 2.86 × 10−4 for expression PC1.

To estimate the genomic interval represented by each significant LD-filtered SNP, we report significant regions defined by the most distant flanking SNPs in the full dataset showing r2 > 0.9 (genotypic LD, measured in PLINK) with each significant SNP. We combined significant regions <10 Mb apart into a single region.

Testing for genetic interactions

Request a detailed protocol

Identifying genetic interactions using GWAS is computationally and statistically challenging. To improve power, we reduced the number of tests performed by testing for interactions only among significant SNPs (FDR < 0.1) identified using GEMMA. We tested all pairs of significant SNPs located on different chromosomes for each phenotype (692 pairs RTW, 82,428 pairs expression PC1). To account for relatedness among individuals we used a mixed model approach, similar to the model implemented in GEMMA. We used the lmekin function from the coxme R package (Therneau, 2012) to fit linear mixed models including the identity-by-state kinship matrix as a random covariate. We report interactions as significant for SNP pairs with p < 0.05 and FDR < 0.1 for interaction terms (RTW: FDR < 0.1 ∼ p < 0.02; expression PC1: FDR < 0.09 ∼ p < 0.05).

Mapping simulations

Request a detailed protocol

We performed simulations to evaluate the performance of our mapping approach under varying genetic architectures and allele frequencies. We simulated phenotypes using several genetic models of two-locus epistasis and parameters based on the empirical distribution of relative testis weight. The simulation procedure is illustrated in Figure 4—figure supplement 1. To preserve genetic structure, we simulated phenotypes using two-locus genotypes from the SNP dataset.

We tested 100 autosomal–autosomal SNP pairs (SNPs on different chromosomes) and 100 X–autosomal pairs (50 with domesticus X-linked sterile alleles and 50 with musculus X-linked sterile alleles). The criteria used for choosing ‘causative’ SNPs were a minor allele frequency >0.05 in the mapping population and fixed in at least one subspecies. The ‘sterile’ allele could be polymorphic or fixed within subspecies but the alternate ‘non-sterile’ allele had to be fixed within the other subspecies—e.g. domesticus sterile alleles have frequencies 0.05–1.0 in the domesticus reference populations from France and Germany and the alternate allele at those SNPs are fixed in musculus samples from the Czech Republic and Kazakhstan. For each pair, the ‘causative’ SNPs were randomly selected from all SNPs meeting those criteria (144,506 possible domesticus sterile, 124,390 possible musculus sterile).

For each SNP pair, we modeled all possible combinations of recessive, additive, and dominant sterile alleles. For each model type, we assigned mean Z scores for each possible two-locus genotype (Figure 4—source data 1). The magnitude of the most severe phenotype (−2.3 standard deviations) is based on observed relative testis weights in the most severely affected males. The mean Z score for heterozygotes in additive models was −1.15. Mean Z scores for non-sterile genotypes in the models were randomly drawn from a uniform distribution between −0.5 and 0.5.

For each SNP pair/architecture, 100 data sets were generated by drawing phenotypes (Z scores) for each individual from a normal distribution with the appropriate two-locus mean and standard deviation = 0.75. The standard deviation value, equivalent to 2.98 mg/g, was chosen on the basis of standard deviations in pure subspecies samples from the mapping population (domesticus = 2.13, musculus = 3.65; (Turner et al., 2012)). This value is higher than standard deviations in intraspecific F1 males (domesticusLEWES × domesticusWSB = 1.2, musculusPWK × musculusCZECH = 1.0; (Good et al., 2008a)), suggesting estimates of mapping power may be conservative.

In total, 90,000 simulations were performed, (9 architectures × 100 SNP pairs × 100 data sets). We identified significant SNPs for each data set using GEMMA, as described above for the empirical data.

Significance of overlap between candidate sterility loci

Request a detailed protocol

We used permutations to test for non-random co-localization of candidate sterility loci from this study and previous QTL and hybrid zone studies. The locations of significant GWAS regions for relative testis weight and expression PC1 were randomized 10,000 times using BEDTools (Quinlan and Hall, 2010). To assess overlap between significant regions for the two phenotypes, we counted the number of RTW regions overlapping PC1 regions (and vice versa) for each permutation. To test for overlap between GWAS identified regions and previously reported candidate regions for related phenotypes, we counted the number of permuted regions overlapping the positions of the published regions (fixed) for each replicate. GWAS regions for both phenotypes were compared to genomic regions with evidence for epistasis and reduced introgression in the Bavarian transect of the hybrid zone (Janousek et al., 2012). In addition, RTW regions were compared to testis weight QTL from mapping studies in F2 hybrids from crosses between subspecies (Storchova et al., 2004; Good et al., 2008b; White et al., 2011; Dzur-Gejdosova et al., 2012) and expression PC1 regions were compared to trans eQTL hotspots identified in F2 hybrids (Turner et al., 2014).

Gene annotation

Request a detailed protocol

We used ENSEMBL (version 66, February 2012) Biomart to download gene annotations for genomic regions significantly associated with relative testis weight. We identified candidate genes in significant regions with roles in male reproduction using reviews of male fertility (Matzuk and Lamb, 2008), manual searches, MouseMine searches for terms related to male fertility (http://www.mousemine.org/), and gene ontology (GO) terms related to male reproduction or gene regulation (plus children): meiosis GO:0007126; DNA methylation GO:0006306; regulation of gene expression GO:0010468; transcription GO:0006351; spermatogenesis GO:0007283; male gamete generation GO:0048232; gamete generation GO:0007276; meiotic cell cycle GO:0051321. Many genes with roles in reproduction reported in publications were not annotated with related GO terms, highlighting the limitations of gene ontology. Moreover, genes causing sterility might not have functions obviously related to reproduction.

Data availability

The following data sets were generated
    1. Turner L
    2. Harr B
    (2014) GWAS mapping genotype data
    Available at Dryad Digital Repository under a CC0 Public Domain Dedication.


    1. Bateson W
    Darwin and modern science
    Darwin and modern science, Cambridge:, Cambridge University Press.
    1. Benjamini Y
    2. Hochberg Y
    Controlling the false discovery rate: a practical and powerful approach to multiple testing
    Journal of the Royal Statistical Society Series B 57:289–300.
    1. Briscoe D
    2. Stephens JC
    3. Obrien SJ
    Linkage disequilibrium in admixed populations: applications in gene-mapping
    The Journal of Heredity 85:59–63.
  1. Book
    1. Coyne JA
    2. Orr HA
    Sunderland, Massachusetts: Sinauer Associates.
    1. Coyne JA
    2. Orr HA
    Speciation and its consequences
    Speciation and its consequences, Sunderland, Massachusetts, Sinauer Associates.
  2. Book
    1. Dobzhansky T
    Genetics and the origin of species
    New York: Columbia University Press.
    1. Falush D
    2. Stephens M
    3. Pritchard JK
    Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies
    Genetics 164:1567–1587.
    1. Good JM
    2. Handel MA
    3. Nachman MW
    Asymmetry and polymorphism of hybrid male sterility during the early stages of speciation in house mice
    Evolution 62:50–65.
    1. Harrison RG
    Hybrid zones: windows on evolutionary processes
    Oxford Surveys in Evolutionary Biology 7:69–128.
    1. Lopez-Romero P
    (2009) PreProcessing of Agilent 4x44 array data
    Agi4x44PreProcess, http://watson.nci.nih.gov/bioc_mirror/packages/release/bioc/html/Agi4x44PreProcess.html.
    1. Muller HJ
    Isolating mechanisms, evolution and temperature
    Biological Symposia 6:71–125.
    1. Orr HA
    The population genetics of speciation: the evolution of hybrid incompatibilities
    Genetics 139:1805–1813.
    1. Pritchard JK
    2. Stephens M
    3. Donnelly P
    Inference of population structure using multilocus genotype data
    Genetics 155:945–959.
    1. R Development Core Team.
    R: a Language and environment for statistical computing
    Vienna, Austria, R Foundation for Statistical Computing.
    1. Rieseberg LH
    2. Whitton J
    3. Gardner K
    Hybrid zones and the genetic architecture of a barrier to gene flow between two sunflower species
    Genetics 152:713–727.
    1. Sage RD
    2. Whitney JB III
    3. Wilson AC
    Genetic analysis of a hybrid zone between domesticus and musculus mice (Mus musculus complex) - hemoglobin polymorphisms
    Current Topics in Microbiology and Immunology 127:75–85.
    1. Therneau T
    coxme: Mixed effects Cox models
    1. Wolf J
    2. Lindell J
    3. Backstrom N
    (2010) Speciation genetics: current status and evolving approaches
    Philosophical Transactions of the Royal Society B: Biological Sciences 365:1717–1733.

Decision letter

  1. Gil McVean
    Reviewing Editor; Oxford University, United Kingdom

eLife posts the editorial decision letter and author response on a selection of the published articles (subject to the approval of the authors). An edited version of the letter sent to the authors after peer review is shown, indicating the substantive concerns or comments; minor concerns are not usually shown. Reviewers have the opportunity to discuss the decision before the letter is sent (see review process). Similarly, the author response typically shows only responses to the major concerns raised by the reviewers.

Thank you for sending your work entitled “Genome-wide mapping in a natural hybrid zone reveals hybrid male sterility loci and Dobzhansky-Muller interactions” for consideration at eLife. Your article has been favorably evaluated by Detlef Weigel (Senior editor) and 3 reviewers, one of whom is a member of our Board of Reviewing Editors.

The Reviewing editor and the other reviewers discussed their comments before we reached a decision. We all feel that the design and direction of your experiment are important innovations and the work has the potential to be important in the field. However, there are a number of substantive points that need to be addressed. Moreover, the reviewers all felt that there were multiple aspects of analysis, presentation and interpretation that need to be improved.

Below, you will find the full reviews. I should note that it is not standard practice at eLife to return the full reviews in addition to the editor's summary. However, in this case I think the range of comments is a useful reflection of the responses of readers and are likely to be helpful in revision.

In terms of revising, the following major points need to be addressed:

Analysis: Given the large effect of the X chromosome it seems critical to include X variants in the covariance matrix. We also believe that more could be made of analyses of the genetic architecture of the trait (e.g. contribution of individual chromosomes to variance in the trait using GCTA or similar software). Similarly the DMI model makes specific predictions about the direction of epistatic effects (combinations of derived alleles deleterious) which should be easy to address by polarising variants using an outgroup.

Interpretation: The phenotype studied is not hybrid sterility (despite the Title). There is actually no direct evidence for an association between the trait studied and fitness. Indeed the statement that there were no significant results for sperm count is a little worrying. However, there is limited evidence that the phenotypes of the hybrids are typically outside the range of normal variation within the species. It is important to note these caveats.

Presentation: Without fine-mapping or replication, the ability to identify/localise specific genes as being important in the trait is limited. For this reason, we feel the emphasis on long lists of rather weakly-supported candidate genes is misplaced. There are also issues with formatting, clarity of figures and citations. Please concentrate on these above points in your revision. The comments below should help you in understanding details of these issues.

Reviewer #1:

The genetic study of reproductive isolation and speciation has the potential to be transformed by the large-scale study of genetic variation in hybrid zones and mapping experiments. This paper presents the first (at least as far as I know) GWAS of a reproductive isolation phenotype (relative testis weight) between Mus musculus musculus and M. m. domesticus from a hybrid zone. The principle findings are:

1) There are multiple (estimate of 26) regions across the genome contributing to the phenotype.

2) Some of the identified regions overlap with (or are close to) regions identified as affecting reproductive traits from F2 crosses.

3) Many of the regions harbour loci that also affect gene expression in the testis.

4) There is some evidence for interactions between loci, consistent with the DMI model.

These findings are of broad interest to the field and an interesting counterpart to the mapping results, though perhaps not unexpected and I feel not presented in as compelling a way as possible (see below). To be compelling, I think there has to be a more rigorous approach to validating the findings.

Major points:

1) A standard GWAS approach, using linear mixed models to account for relatedness between individuals is used. However, while human GWAS studies typically use 'genome-wide significant' (a somewhat vague term, but implying a conservative threshold for significance of 5e-8) and require replication, the approach taken here is to use an FDR approach with a generous threshold of 10% and no replication. Permutation studies presented show that this threshold is somewhat permissive. More generally, there is no attempt made to either establish how well calibrated the FDR approach is, nor to validate findings through replication. To me this is a serious limitation, and while I understand the authors' desire to maximise the findings from the study, I think a more limited, but well-validated, set of loci would be both more credible and of wider value. In short, I think validation of association signals in a replication set of samples is needed.

2) I would like to know more about the data. For example, how does relative testis weight vary with age (mice were killed across a range of ages), does it differ between the parental species? Likewise, more could be made from chromosome-scale analyses of variance; e.g. using GCTA. Similarly, it would be interesting to attempt to estimate dominance and epistatic variance components. Also, the DMI model makes specific predictions about the direction of interactions. I would like to know what fraction of identified interactions fit with a model in which derived allele (inferred using an outgroup) interactions are deleterious (as opposed to alternative scenarios).

3) Although each is largely minor, there are many comments about the methodology used; justifying methods and parameter choices. I am particularly concerned about the exclusion of the X chromosome from the relatedness matrix, not least because it apparently has such a large estimated effect. This needs to be addressed.

Reviewer #2:

The manuscript by Turner and Harr reports the genome-wide mapping of relative testis-weight loci segregating in a natural hybrid zone of Mus m. musculus and Mus m. domesticus. They combined the high resolution Mouse Diversity Genotyping array for GWAS analysis with microarray-based gene expression analysis of the whole testis to identify the genomic loci potentially involved in reproductive isolation between both species. The logic of the experiments is difficult to follow without first reading their previous paper (Turner et al. Evolution 66:443). Even then, there are three main concerns: First and most important, the phenotype studied is complex and in respect to hybrid sterility rather vague. Testes weight (or relative TW) variation can reflect genetically determined variations without any defect, various defects in haploid phase of meiosis, intrameiotic arrests of various causes and various extent, or even a premeiotic block. If rTW is taken as the only phenotypic trait it is not surprising that the outcome is so complex and its interpretation necessarily difficult. If, at least, histological evaluation and, for example, sperm count were included, the information value would have been much higher.

Second, the use of the term “hybrid sterility loci” seems as a kind of overstatement, because none of the examined males proved to be sterile. In fact, Turner et al. 2011 Evolution paper states 'breeding data do show that hybrids have similar fertility to pure subspecies pairs' (text and Table 2). Moreover, their breeding scheme was designed to maximize the number of hybrid offspring. Thus a significant portion of variation in testes weight could reflect physiological intra- and inter-species QTLs. The argument that variation is much larger in hybrid males than in pure species is weakened by the low number of examined pure species males (See also their Evolution paper).

Third, the Dobrzhansky-Muller incompatibilities, which result in lowering testes weight can favor the involvement of such loci in reproductive isolation. The authors found 149 such chromosomal region pairs but their documentation is missing, perhaps with the exception of Figure 5, which, however, seems to show only expression DMIs (?).

In conclusion, the strength of the paper includes the first attempt to use GWAS on wild mice from a hybrid zone to infer the genetic networks involved in reproductive isolation of a young species pair. The main weak point is the inadequate phenotype selected for quantification and consequent overstatements in the interpretation of the obtained data.

Reviewer #3:

In this study, the authors use crosses among wild mice collected in the hybrid zone between M. m. musculus and M. m. domesticus to map genomic regions associated with reduced relative testis weight in hybrid males. This is a novel approach to identifying regions associated with hybrid male sterility in a system of general interest to the evolutionary genetics community. They found 26 regions across the genome, all of which interacted with at least one partner and most of which interacted with many. They also used GWAS to identify loci associated with variation in expression of testis-expressed genes. The approach seems appropriate and well executed.

My substantive comments are regarding organization, clarity and the interpretation of results. These should be remedied easily with a short round of editing.

The writing could be tighter and clearer, especially in the Abstract and Introduction. Be upfront about the advantages and disadvantages of the approach and clearly summarize the approach and major findings. Hybrid zone analyses reflect current processes in a zone of secondary contact that is relatively recent. Their significance to initial phases of reproductive isolation in allopatry is bolstered when there is overlap between this approach and QTL mapping studies. In addition, even if these specific regions were not critical in early phases, they give insight into the genetic architecture of traits that almost certainly were important.

I was often confused in the manuscript regarding which results were being referred to when e.g. which regions are being referred to in the sentence “All significant regions are involved in”?

Citations are a bit sloppy. For example, in the first paragraph of the Introduction, the authors posit that two approaches have recently substantially advanced understanding. I expected some citations of recent work and instead found only citations from Dobzhansky and Muller. Another example, the citations for “the long-recognized potential for mapping in hybrid zones...” contain only relatively recent papers. This could be easily remedied by adding “for review, see Reiseberg and Buerkle” given that that paper does review some of the older work in the system. There are many additional citations that should be considered for the statement that “islands may not always represent targets of selection...” All in all, I would suggest going through the manuscript more carefully and including citations more consistently and broadly.

More information about the system would also be useful in the Introduction, e.g. what are the three subspecies of house mice, how long have they been diverged, how old is the hybrid zone, etc. What have we learned previously? The Abstract does not actually give the names of the two subspecies.

Explain why there may and may not be overlap between loci uncovered using mapping in hybrid zones and mapping between allopatric populations.

I think this section could be re-organized to make the whole approach clearer. Explain motivation for insight into nature and timing of fertility defects. Why consider testis expression changes in the context of infertility? Give more info on what was actually done. You associated specific loci with variation in what measure of expression? Help the reader more clearly connect your results to specific insight (this refers to paragraph 1 and 2 in this section).

Candidate genes:

First sentence, identified genes from which of the previous analyses? Make the methods for identifying specific genes more explicitly methodical. First, we looked at all genes in the 26 regions with... Then we focused on the genes that were implicated in both this analysis and...

Success of GWAS:

This section seems a bit unclear. Is it GWAS that was very successful or using many different approaches? What is the success here-identification of relatively few locit or better characterization of the architecture? Be upfront about the possible downsides (environmental effects, phenotype characterization).

The entire simulation section needs more explanation and justification. Many tables and figures are devoted to this (many which could be supplemental) but the explanation is very slight. Insight into the importance of which factors in the simulations? There are three criteria for true or false positives-which ones correspond to something you considered true and which ones false? What do the results mean for the interpretation of results?

Genetic Architecture:

This section seems weak. What do you find? How does this compare with previous studies? What does this study in particular add? The reference to the Snowball effect is too slight to be effective.


Author response

Analysis: Given the large effect of the X chromosome it seems critical to include X variants in the covariance matrix. We also believe that more could be made of analyses of the genetic architecture of the trait (e.g. contribution of individual chromosomes to variance in the trait using GCTA or similar software). Similarly the DMI model makes specific predictions about the direction of epistatic effects (combinations of derived alleles deleterious) which should be easy to address by polarising variants using an outgroup.

Inclusion of X genotypes:

We repeated the mapping analysis, including X genotypes in the covariance matrix. The main effect of this change was a slightly more stringent P value cutoff to identify significant SNPs, which resulted in the identification of fewer genomic regions (26 regions excluding the X-chromosomal markers, 12 regions including the X-chromosomal markers). The positions of the regions stayed essentially the same.

Variance explained by individual chromosomes:

As suggested by reviewer 1, we used the software gcta to try and estimate the contribution of individual chromosomes to phenotypic variance (i.e. relative testis weight). This analysis has been pioneered on large GWAS datasets on human quantitative traits, such as height. We faced several problems with this analysis. The first problem is a conceptual one, the second a technical.

The model that gcta is fitting assumes that SNPs have additive effects on the trait. While this is very likely a good approximation for traits such as human height, it is not appropriate for hybrid sterility traits, because the Dobzhansky-Muller model predicts epistatic interactions between loci are necessary to observe hybrid defects. The additive genetic model would be adequate if we would map quantitative variation in testis weight within a subspecies, but not for the transgressive phenotypes in the hybrids between subspecies.

The technical problem arises when one deals with a highly structured population, such as our population from the hybrid zone. If there is an effect of population structure, SNPs on one chromosome will be correlated with the SNPs on the other chromosomes, hence estimates of variance explained individual chromosomes are overestimates. To correct for this, Yang et al. (2011) introduced a model where the genetic relatedness matrices of all the chromosomes are fitted jointly (the –mgrm flag in gcta) to estimate the variance explained by each of the chromosomes. When we applied this version of the model, the likelihoods did not converge, even after running the maximum number of iterations allowed by the software (10,000). The most likely explanation for this problem is that our sample size is too small (185 individuals). The recommended minimum sample size for genome-partitioning analysis is 5,000 (http://gcta.freeforums.net/thread/27/partioning-autosomes).

The results presented below were obtained by running the REML procedure on each chromosome separately, incorporating the first 10 principle components (Eigenvectors) calculated over the whole autosomal dataset to correct for population structure. Similar results were obtained when Eigenvectors were calculated from the respective chromosome to correct for population structure.

chromosomeV(G)/VpSEP value

The heritability estimates (i.e. the proportion of phenotypic variation due to additive genetic factors, V(G)/Vp) are not reliable due to the complications stated above. However, in both sets of analyses, chromosomes 2, 17 and X explained most of the variation. These results are consistent with our mapping analysis, which identified significant GWAS regions located on those chromosomes.

Ancestral vs. derived sterility alleles:

We agree that determining if sterility alleles are ancestral or derived would be of great interest. However, significant SNPs identified by GWAS are unlikely to include the causal mutations for mapped phenotypes. Without knowing the causal variant, it is not straightforward to categorize sterility alleles as ancestral or derived. Nevertheless, we compared genotypes in significant regions from M. m. musculus and M. m. domesticus populations to M. spretus, M. spicilegus and M. mattheyii. Each region comprises numerous SNPs that can be polarized and additional sites with shared polymorphisms between musculus and domesticus. Thus, it will not be possible to determine which alleles are ancestral vs. derived until future studies identify causal genes/mutations.

Interpretation: The phenotype studied is not hybrid sterility (despite the Title). There is actually no direct evidence for an association between the trait studied and fitness. Indeed the statement that there were no significant results for sperm count is a little worrying. However, there is limited evidence that the phenotypes of the hybrids are typically outside the range of normal variation within the species. It is important to note these caveats.

We agree that that there was a need for additional explanation of links between the mapped phenotype and hybrid sterility. We discuss this issue in detail now and note caveats in the “phenotyping” section of Methods, and the “sterility-associated phenotypes” and “effect size” sections of Results. In addition, we mapped another sterility-associated phenotype: testis expression PC1.

Presentation: Without fine-mapping or replication, the ability to identify/localise specific genes as being important in the trait is limited. For this reason, we feel the emphasis on long lists of rather weakly-supported candidate genes is misplaced. There are also issues with formatting, clarity of figures and citations.

We agree. We simplified annotation of the GWAS regions, reporting only genes with strong evidence for roles in reproduction (Tables 1-2) and removed most of the discussion of candidate genes from the text.

We made several major changes to the organization of the manuscript to improve clarity. First, we edited the Introduction, incorporating suggestions from reviewers to better explain the motivation and logic of the study. Second, we separated Results and Discussion and added more subheadings. Third, we reduced the number of figures in the main text to four.

Additional changes:

We made several changes in response to concerns of individual reviewers, notable examples include:

Significance thresholds: We clarified the motivation for reporting results using a permissive (FDR <0.1) threshold and using multiple lines of evidence to identify loci likely to be true positives. We report estimates of the false positive rate using stringent and permissive thresholds from simulations.

Tests for interactions:

Instead of testing for interactions with MCMCglmm, we used a mixed model approach similar to the GEMMA framework used to identify single-SNP associations.

Overlap between GWAS regions and sterility QTL:

We performed permutation tests to determine if associations between GWAS regions and previously reported candidate sterility regions were non-random. It was not possible to test for non-random concordance between genetic interactions identified in this study and those reported in F2 hybrids by permutation because individual GWAS regions have multiple partners and the number of possible pairwise interactions between SNPs varies widely across region pairs. In the absence of statistical support for the overlapping pattern, we decreased emphasis on this result by including the figure showing overlap as a supplemental figure.

Yang, J., et al. 2011. Genome partitioning of genetic variation for complex traits using common SNPs. Nat. Genet. 43:519–525.


Article and author information

Author details

  1. Leslie M Turner

    1. Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
    2. Laboratory of Genetics, University of Wisconsin, Madison, United States
    LMT, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
  2. Bettina Harr

    Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Plön, Germany
    BH, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    Competing interests
    The authors declare that no competing interests exist.


Deutsche Forschungsgemeinschaft (SFB-680)

  • Bettina Harr


  • Leslie M Turner
  • Bettina Harr

National Human Genome Research Institute (5T32HG002760)

  • Leslie M Turner

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.


We thank Xiang Zhou for expert support using GEMMA. We thank Bret Payseur for useful discussion and Diethard Tautz, Luisa Pallares, Trevor Price, Detlef Weigel, Jiri Forejt, Gil McVean and an anonymous reviewer for comments on the manuscript. LMT was supported by postdoctoral funding from the Max Planck Society (to D Tautz) and by a National Human Genome Research Institute (NHGRI) training grant in Genomic Sciences to the University of Wisconsin (NHGRI 5T32HG002760). Research funding for this project was provided by the Max Planck Society (to D Tautz) and the Deutsche Forschungsgemeinschaft (SFB-680 to BH).

SNP genotype data were deposited in doi:10.5061/dryad.2br40 and gene expression data in Gene Expression Omnibus GSE61417.


Animal experimentation: This study was performed according to approved animal protocols and institutional guidelines of the Max Planck Institute. Mice were maintained and handled in accordance to FELASA guidelines and German animal welfare law (Tierschutzgesetz § 11, permit from Veterinäramt Kreis Plön: 1401-144/PLÖ-004697).

Reviewing Editor

  1. Gil McVean, Oxford University, United Kingdom

Publication history

  1. Received: February 10, 2014
  2. Accepted: October 23, 2014
  3. Version of Record published: December 9, 2014 (version 1)


© 2014, Turner and Harr

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.


  • 3,282
    Page views
  • 416
  • 57

Article citation count generated by polling the highest count across the following sources: Crossref, Scopus, PubMed Central.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Genetics and Genomics
    Megan Phifer-Rixey

    Hybrid mice shed new light on the interactions between regions of the genome that help drive the evolution of new species by reducing the fertility of hybrid males.

    1. Genetics and Genomics
    2. Microbiology and Infectious Disease
    Jessamyn I Perlmutter et al.
    Research Article

    Wolbachia are the most widespread bacterial endosymbionts in animals. Within arthropods, these maternally-transmitted bacteria can selfishly hijack host reproductive processes to increase the relative fitness of their transmitting females. One such form of reproductive parasitism called male killing, or the selective killing of infected males, is recapitulated to degrees by transgenic expression of the WO-mediated killing (wmk) gene. Here, we characterize the genotype-phenotype landscape of wmk-induced male killing in D. melanogaster using transgenic expression. While phylogenetically distant wmk homologs induce no sex-ratio bias, closely-related homologs exhibit complex phenotypes spanning no death, male death, or death of all hosts. We demonstrate that alternative start codons, synonymous codons, and notably a single synonymous nucleotide in wmk can ablate killing. These findings reveal previously unrecognized features of transgenic wmk-induced killing and establish new hypotheses for the impacts of post-transcriptional processes in male killing variation. We conclude that synonymous sequence changes are not necessarily silent in nested endosymbiotic interactions with life-or-death consequences.