Abstract
The long-term balancing selection acting on mating-types or sex determining genes is expected to lead to the accumulation of deleterious mutations in the tightly linked chromosomal segments that are locally “sheltered” from purifying selection. However, the factors determining the extent of this accumulation are poorly understood. Here, we take advantage of the large number of alleles at the sporophytic self-incompatibility system of the Brassicaceae along a complex dominance hierarchy to evaluate how the pace at which linked deleterious mutations accumulate varies with the intensity of balancing selection. We first experimentally measured the phenotypic manifestation of the linked load at three different levels of the dominance hierarchy. We then sequenced and phased polymorphisms in the chromosomal regions linked to 126 distinct copies of S-alleles in two populations of Arabidopsis halleri and three populations of A. lyrata. We find that linkage to the S-locus locally distorts phylogenies over about 10-30kb along the chromosome. The more intense balancing selection on dominant S-alleles results in greater fixation of deleterious mutations, while recessive S-alleles accumulate more linked deleterious mutations that are segregating. Hence, the structure rather than the overall magnitude of the linked genetic load differs between dominant and recessive S-alleles. Our results have consequences for the long-term evolution of new S-alleles, the evolution of dominance modifiers between them, and raise the question of why the non-recombining regions of some sex and mating type chromosomes expand over evolutionary times while others, such as that the S-locus of the Brassicaceae, remain restricted to small chromosomal regions.
Introduction
Sexual reproduction typically involves mating between individuals that belong to separate groups of reciprocal reproductive compatibility. This partition among conspecific individuals is obvious in species where the male and female reproductive functions are performed by distinct categories of individuals, but can also be present in purely hermaphroditic species, where compatibility among individuals is governed by the segregation of various numbers of mating types in the total absence of sexual specialisation. The existence of sexes or mating-types leads to one the strongest forms of long-term balancing selection, and is often associated with clusters of polymorphisms around sex/mating-type controlling regions kept together by structural rearrangements. In some cases, such rearrangements can span almost entire chromosomes (e.g. sex chromosomes in mammals 1 or mating-type chromosomes in ascomycete fungi 2), while in others they remain limited to relatively small genomic regions (e.g. chromosomal inversions controlling male reproductive morphs in the ruff3, mating-type loci in some basidiomycete fungi, segregating indels controlling pin vs. thrum floral morphs in Primula 4). The long-term balancing selection acting on these systems is expected to lead to the accumulation of deleterious mutations in the tightly linked chromosomal segments that are “sheltered” from purifying selection by the presence of the balanced polymorphism5,6. These deleterious mutations can have drastic short- and long-term consequences for the evolution of the species, and determining the processes by which they accumulate is crucial to understand how the rearranged regions can either expand along the chromosomes or conversely remain restricted to limited genomic tracts7,8.
Self-incompatibility (SI) is a genetic mechanism allowing recognition and rejection of self-pollen by hermaphrodite individuals, thereby preventing inbreeding and promoting outcrossing in hermaphroditic plant species 9. In the Brassicaceae family, SI is controlled by a single non-recombining chromosomal region, the S-locus10,11. SI is one of the most prominent examples of long-term balancing selection 12, and as such deleterious mutations are expected to accumulate in very close genetic linkage to the S-alleles because of the indirect effects of linked selection 6. Population genetics models predict that deleterious variants should accumulate within specific S-allele lineages12,13, and should then be reshuffled among them by recombination. However, due to the technical difficulty of phasing polymorphisms, this process has rarely been characterised in detail14.
A key feature of sporophytic SI systems, also shared by sex chromosomes, is the existence of dominance interactions between S-alleles. While most individuals are heterozygous at the S-locus and thus carry two different S-alleles, only one of them is generally expressed at the phenotypic level, especially for the pollen specificity, following a complex genetic dominance hierarchy 15,16. Evolutionary properties of S-alleles vary along the dominance hierarchy because balancing selection is predicted to act more strongly on dominant than on recessive S-alleles, as the latter are often masked at the phenotypic level 17. As a result, Llaurens et al. 13 and Goubet et al. 18 suggested that the dynamics of accumulation of deleterious variation may differ in close linkage with dominant vs recessive S-alleles, similarly to Y vs. X chromosomes, respectively. Specifically, recessive S-alleles can form homozygous combinations in natural populations more often than dominant S-alleles17, such that recombination can occur occasionally between distinct gene copies of the same recessive S-allele, providing the opportunity for the linked recessive deleterious mutations to be purged from within the S-locus itself. In addition, because recessive S-alleles reach higher population frequencies15,17, purifying selection on linked deleterious variants is expected to have higher efficacy among gene copies of recessive than dominant S-alleles. This is expected to result in a higher fixation probability of deleterious variants linked to the class of dominant S-alleles than to the class of recessive S-alleles13. Empirical support for this simple prediction has been conflicting, though. Based on phenotypic measurements in A. halleri, Llaurens et al. 13 observed a decrease of fitness associated by enforced homozygosity for one of the most dominant S-alleles (Ah15) but not for the most recessive S-allele of the allelic series (Ah01). In contrast, Stift et al. 19 observed no effect of dominance on the genetic load linked to three dominant vs . recessive S-alleles in a natural population of the closely related A. lyrata. Hence, the data available so far are inconclusive, but are restricted to very small numbers of S-alleles. They are also based on inherently limited phenotypic measurements, seriously limiting the power of the comparisons, and preventing proper generalisation of the effect of dominance on the accumulation of linked deleterious variation.
In this study, we combined phenotypic, genomic and theoretical approaches to finely dissect the patterns of accumulation of deleterious variation linked to the S-locus supergene in A. halleri and A. lyrata. We first extended the phenotypic approach of Llaurens et al. 13 to a series of additional S-alleles from the same local A. halleri population to evaluate the effect of S-allele dominance on the sheltered load. We then used parent-offspring trios and targeted genome re-sequencing to directly quantify the accumulation of putative deleterious mutations linked to phased dominant vs. recessive S-alleles in two A. halleri and three A. lyrata natural populations. Finally, we used stochastic simulations to refine the theoretical predictions about the effect of S-allele dominance on the dynamics of linked deleterious mutations. Overall, our results provide a more nuanced view of the effect of the intensity of balancing selection on the sheltered load, in which the structure of the sheltered load rather than its magnitude differs among S-alleles from different dominance classes.
Results
The genetic load linked to the S-locus varies among S-alleles, but is not correlated with dominance
We first expanded the experimental approach of Llaurens et al. 13 to phenotypically evaluate the effect of S-allele dominance on the intensity of the sheltered load. The previous study focused on three S-alleles (Ah01, Ah02 and Ah15) 13. Here we included two S-alleles from the same local population (Nivelle, France): Ah03 and Ah04, and included Ah01 again for comparative purposes. In the Arabidopsis genus, S-alleles have been shown to form a complex dominance hierarchy 15,16. This hierarchy is largely associated with the phylogeny of S-alleles15, and at least four phylogenetic classes (I, II, III and IV) have been described, from the most recessive (class I) to the most dominant of S-alleles (class IV). Dominance interactions also exist among S-alleles within classes, such that these five S-alleles form the following dominance hierarchy 15,16: Ah01<Ah03<Ah02<Ah04<Ah15, from the most recessive (Ah01) to the most dominant (Ah15). To reveal the linked load, we enforced homozygosity at the S-locus using controlled crosses between parental individuals sharing a given S-allele that was masked by different dominant S-alleles (e.g., to obtain Ah xAhx homozygotes we deposited pollen from a Ah xAhy plant, where Ahy>Ahx, on pistils of a Ah xAhz plant where z≠y, or on a AhxAhx pistil when available; table S1). We obtained 399 offspring from a total of six such crosses. Note that our experimental procedure differs slightly from that of Llaurens et al. 13 in that their procedure required a CO 2 treatment to bypass the SI system and obtain selfed offspring, while here we took advantage of the dominance interactions to obtain outcrossed S-locus homozygous individuals that we phenotypically compared to their full-sibs with S-locus heterozygous genotypes. Note also that the S-locus homozygous offspring we obtained contain distinct gene copies of a given S-allele lineage. Hence, they could in principle carry distinct suites of linked deleterious mutations in case these mutations segregate within S-allele lineages.
We first tested whether homozygosity at the S-locus affected survival by measuring for each cross the proportion of homozygotes at the S-locus reaching the reproductive stage for three S-alleles (in two replicate families per S-allele, Table S1). The proportion of Ah01/Ah01 and Ah04/Ah04 homozygotes surviving to the reproductive stage was consistent with mendelian expectations in their respective families. However, we observed a significant decrease of Ah03/Ah03 homozygotes at the reproductive stage compared with Mendelian expectations (Table 1), whereas the observed proportion of the Ah03 S-allele among heterozygous individuals did not depart from expectations (2/3=0.67; Table 1). Thus, the increased mortality is associated with Ah03 homozygosity, rather than with a lower performance of individuals carrying the Ah03 S-allele itself. Overall, a genetic load was thus observed linked to the Ah03 S-alleles, which is at an intermediate level of dominance, but neither to the most dominant (Ah04) nor to the most recessive (Ah01) S-allele. Hence, these observations do not support a positive correlation between S-allele dominance and the magnitude of the sheltered load.
Next we measured thirteen vegetative and reproductive traits in the resulting families and compared offspring that were homozygous for their S-alleles with their full sibs that were heterozygous (Fig. 1). We first used permutations to test whether the mean trait value of homozygotes differed from that in heterozygotes. Overall, with a single exception, we found no effect of homozygosity at the S-locus on variation of the traits measured (Fig. 1; Table S2). The maximum length of flowering stems was the exception to this general pattern, with longer reproductive stems for S-locus homozygous than heterozygous genotypes, hence in the opposite direction from our expectation of lower fitness in homozygotes. For this trait, there was significant variation among replicate families for homozygotes of the recessive allele Ah01 but not of the dominant allele Ah04 (Table S3). We then used generalised linear models (GLM) to evaluate the effect of dominance (as a fixed effect) on the mean phenotypic value of homozygotes compared to heterozygotes for each trait (Table S4; treating family of origin, attacks by phytopathogens, phytophagous and oxidative stress as random effects whenever necessary). We also observed no effect of S-allele dominance on the contrast between S-locus homozygotes and heterozygotes for any of these traits. A single of the thirteen traits was an exception to this general pattern, but again the effect was in the opposite direction from our expectation, with an earlier rather than delayed appearance of the first leaf for homozygotes of more dominant S-alleles; Table S4). Overall, our phenotypic results confirmed the presence of a detectable linked load on some phenotypic traits (survival; time to produce the first leaf), but we could not replicate the observation of Llaurens et al. 13 that dominant S-alleles carry a more severe deleterious load than recessive S-alleles, even though our samples were obtained from the same local population.
S-alleles are associated with specific sets of tightly linked mutations
The model of the sheltered load assumes that distinct S-allele lineages carry specific sets of linked deleterious mutations, but to our knowledge this prediction was never tested directly. We combined a parent-offspring trio approach with sequencing of the S-locus flanking regions to phase the mutations segregating in the S-locus flanking regions with their respective S-alleles. Briefly, we used a previously developed sequence capture protocol specifically targeting the nucleotide sequences over 75 kb on each side of the S-locus along with a series of 100 control regions from throughout the genome20, and we analysed nucleotide sequence polymorphism (including only invariant and biallelic SNPs), based on the A. lyrata reference genome 21. We define a haplotype as a unique combination of mutations along the phased chromosome, and a S-allele lineage as the collection of gene copies of a given functional S-allele (different functional S-alleles are distinguished based on their strong sequence divergence at the S-locus pollen and pistil genes). Different gene copies within an S-allele lineage can thus be associated with distinct linked haplotypes in the flanking regions. The S-alleles were identified based on short reads sequences according to a previously published method 41. We analysed two closely related A. halleri populations from Europe (Nivelle and Mortagne) and three allogamous A. lyrata populations from North America (IND, PIN and TSS 22). Overall, we were able to reconstruct 34 haplotypes linked to a total of 12 distinct S-allele lineages in Nivelle, 38 haplotypes linked to 11 distinct S-allele lineages in Mortagne and 16, 22 and 16 haplotypes associated with 6, 7 and 5 distinct S-allele lineages in populations IND, PIN and TSS, respectively (Table S5). Nine of the S-alleles were shared between the two A. halleri populations (Ah01, Ah03, Ah04, Ah05, Ah12, Ah20, Ah24, Ah25 and Ah59). In the populations of A. lyrata, four S-alleles were shared between PIN and TSS (Ah01*, Ah03*, Ah18* and Ah63*), five S-alleles were shared between PIN and IND (Ah01*, Ah03*, Ah46* and Ah63*), four S-alleles were shared between IND and TSS (Ah01*, Ah03*, Ah31* and Ah63*), and three were shared across all three (Ah01*, Ah03* and Ah63*). Note that for convenience, we used A. halleri notations (with the addition of a *) to refer to the trans-specifically shared A. lyrata S-alleles. Altogether, we were able to obtain the phased flanking sequences of 126 S-locus haplotypes, comprising a total of 4,854 variable sites. This provides considerable power to evaluate the local accumulation of linked mutations across S-alleles of different levels of dominance and to examine their patterns of conservation between populations and between species.
Mutations in the S-locus flanking regions can be exchanged between S-alleles by recombination, and between local populations by migration 23. The relative time scale of these two processes (recombination vs. migration) determines the distribution of the linked mutations. To capture the chromosomal extent of this effect of linkage to S-alleles, we developed a new phylogenetic method comparing the likelihood of two contrasted topologies of interest in overlapping windows along the chromosome: (1) the topology clustering haplotypes by the populations where they came from vs. (2) the topology clustering them by the S-allele to which they are linked (Fig 2A). This allowed us to evaluate the progressive shift from a predominant topology by S-alleles close to the S-locus to a topology by populations further along the chromosome and in unlinked control regions (Fig. 2B). The difference in log likelihood between the two topologies decreased significantly with distance to the S-locus (Pearson coefficient = -0.015 and -0.010 for A. halleri and A. lyrata respectively; p-values <2e-16). In A. halleri, the topology grouping haplotypes by populations became more likely than the topology grouping them by S-alleles at a distance of around 30kb from the S-locus, but even at a distance of 50kb the phylogenetic structure was still different from that in regions unlinked to the S-locus used as controls for the genomic background 20 (Fig 2B). In A. lyrata, the shift was even more rapid (within 10-15kb), although we note that the phylogenetic structure of the control regions was less resolved (Fig 2B). To evaluate these patterns more directly we first examined the data using a Major Component Analysis (MCA, a modified version of PCA adapted to binary data, Fig S1 and S2) and using simple phylogenetic reconstructions (Fig S3, S4, S5 and S6). We confirmed that haplotypes linked to a given S-allele tended to cluster together in the most tightly linked region, and that this grouping by S-alleles was progressively lost in favour of a grouping by population of origin in the most distant regions. Following Kamau et al. 24, we compared the fixation index FST among local populations and among S-alleles in A. lyrata and A. halleri. In both A. halleri and A. lyrata, FST values among S-alleles were high in regions close to the S-locus and quickly decreased to reach the background level (Fig. S7) as the distance from the S-locus increased. In parallel, the differentiation among populations followed roughly the opposite pattern, i.e. it was initially low in regions close to the S-locus (as expected under strong balancing selection) and increased up to background level within the first few kilobases (Fig. S7). In line with our phylogenetic analysis, differentiation between populations started to exceed differentiation between S-alleles much closer to the S-locus in the A. lyrata than in the A. halleri populations (Fig S3, S4, S5 and S6). Finally, we explored the fine-scale patterns of association within populations between individual S-alleles and SNP in the linked and the control regions (Fig S8). As expected, the vast majority of significant associations were found for the most closely linked SNPs. With a single exception, all S-alleles were associated with unique SNPs in the 50kb region around the S-locus, albeit with substantial heterogeneity among S-alleles in the patterns and extent of associations that they show (Fig S8). Overall, our results indicate that due to limited recombination, the S-alleles carry a specific set of polymorphic sites in the linked region. This association fades away for more distant sites over a few kilobases, where population structure becomes predominant, as in the rest of the genome. Hence, different S-alleles are associated with specific sets of tightly linked mutations, but only within 10-30kb.
No overall evidence that dominant S-alleles accumulate more linked deleterious mutations
Llaurens et al. 13 predicted a positive correlation between the dominance of S-alleles and their tendency to fix linked deleterious mutations. Thus, we investigated the correlation between the level of dominance of the S-alleles and their total number of 0-fold degenerate mutations (S 0f) or the ratio of 0-fold to 4-fold mutations (S0f/S4f) for the phased haplotypes, assuming that the vast majority of 0-fold degenerate mutations are deleterious. Based on the results presented above and the results of our previous study 20, for the rest of our analyses we focused on the phased haplotypes over the first 25 kb on either side of the S-locus. We found no overall effect of dominance on S 0f (p-values= 0.54 and 0.07 for A. halleri and A. lyrata respectively; Fig. 3; Table S6) or S 0f/S4f (p-values= 0.54 and 0.07 for A. halleri and A. lyrata respectively; Table S6). Extending the analysis to all non-synonymous mutations led to identical conclusions (Table S6). Overall, our genomic results did not confirm the prediction that dominant S-alleles accumulate a larger number of putatively deleterious mutations in their linked regions. We note that the particular S-allele whose sheltered load was quantified in Llaurens et al. 13 (Ah15, red arrow on Fig 3A) appears to be one of the S-alleles associated with the highest number of 0-fold degenerate mutations among all S-alleles of the most dominant class (class IV).
The structure of the linked genetic load differs between dominant and recessive S-alleles
Theory predicts that dominant S-alleles should fix linked recessive deleterious mutations with a higher probability than recessive S-alleles13, but in natural populations we observed no difference in the total number of putatively deleterious linked to dominant vs. recessive S-alleles. To clarify this discrepancy, we took advantage of our sequencing of multiple copies of S-alleles to consider separately the fixed and the segregating mutations linked to each of the S-allele lineages. For each population, we included only mutations that were segregating, and excluded those that were locally fixed. In agreement with the prediction of Llaurens et al.13, we observed that lineages of dominant S-alleles do indeed tend to fix deleterious mutations more readily (Fig. 4). The fact that they do not accumulate a larger total number of deleterious mutations is explained by the fact that the structure of the genetic load differs between dominant and recessive S-alleles: the dominant S-alleles tend to have more fixed deleterious mutations, but the recessive S-alleles compensate by accumulating a larger number of segregating mutations, resulting in similar numbers of deleterious mutations overall in most of the populations.
Motivated by these empirical observations, we extended the model proposed by Llaurens et al. 13 to predict the dynamics of accumulation of recessive deleterious mutations linked to S-alleles, focusing not only on fixed deleterious mutations but also on those that are segregating within allelic lineages. These stochastic simulations confirmed that, at equilibrium, dominant S-alleles tend to accumulate a larger number of deleterious mutations that are fixed among gene copies within S-allele lineages (Fig. 5A). In contrast, the number of segregating linked mutations was higher for recessive than for dominant S-alleles (Fig. 5B). Our model predicted that these two effects eventually compensate each other, such that in the end the total number of linked deleterious mutations was not expected to change with dominance of the S-alleles (Fig. 5C). These predictions are in line with our genomic observations and suggest that the dominance level of S-alleles modifies the structure of the genetic load they shelter: dominant S-alleles accumulate more fixed deleterious mutations, but recessive S-alleles accumulate more segregating mutations, resulting in an equivalent total load.
Discussion
The genetic load linked to the S-locus is detectable and is manifested on different phenotypes
Our results contribute to a still restricted but growing body of evidence confirming that the accumulation of deleterious mutations linked to strongly balanced allelic lines can be substantial, and that their effect can be detected at the phenotypic level 13,19,25–28. An interesting observation is that the phenotypes on which the load was revealed varied among these studies. Here, the effect of homozygosity at the S-locus was apparent on juvenile survival and on the length of the longest flowering stem, but we detected no effect on any other morphological measurements, including leaf and rosette traits. In the same population of A. halleri, Llaurens et al.13 detected an effect on juvenile survival and on leaf size. A study in North American outcrossing populations of A. lyrata19 detected an effect on juvenile survival, but not on any other traits that they measured. In the horsenettle Solanum carolinense, the load was associated with reduced seed viability, flower number and germination25,28. Hence, the most consistent pattern seems to be a decrease of overall juvenile survival,possiblybecauseitisahighlyintegrativemeasurementoffitness,whereasother morphological or life history traits can be associated with more specific components of overall fitness.
A unique genetic load associated with each allele in each population
The model of the sheltered load posits that each S-allele should be associated with a specific set of linked mutations13. In line with this prediction, another consistent observation is that the magnitude of the S -linked load varied among S-alleles, as the load linked to some S-alleles was phenotypically detectable, while for others it was not. This variation of the genetic load is expected since deleterious mutations associated with the different alleles are likely to hit different linked genes, and affect different phenotypic traits with different effects on fitness. Also in line with the model of the sheltered load, our phasing of a large number of variants linked to S-haplotypes in several natural populations revealed that the same suite of linked mutations was consistently associated among different copies of a given allele when sampled from within the same population, in particular for the dominant S-allele lineages under more intense balancing selection. As expected for outcrossing populations with short-scale linkage disequilibrium, this association was lost when examining sites at increasing genetic distances from the S-locus along the chromosome (see also Le Veve et al.20). However, a proper model of sheltered load taking into account recombination among S-alleles is still missing. Finally, the association with linked sites was further lost when comparing gene copies of S-alleles sampled from different local populations, suggesting that recombination within populations decouples alleles from their linked sites faster than migration can homogenise the genetic composition among these natural populations. We note that the patterns of association and phylogenetic structure differed among populations, possibly due to their contrasted demographic histories. Indeed, the A. lyrata populations colonised North America from ancestral European populations about 20-30.000 years ago 29,30, and are less diverse overall than the A. halleri populations we studied, who colonised the north of France during the last century from ancestral German populations31. The progressive decoupling between alleles and their linked sites leads to the simple prediction that S-locus homozygous genotypes formed by crossing individuals carrying identical alleles from distinct populations should not reveal as much load as when they are formed by crossing individuals within populations. Hence, the S-locus region could contribute to overall hybrid vigour. Testing this prediction will be an interesting next step.
Different properties of the linked load according to S-allele dominance
The question of whether S-allele dominance, which modifies the intensity of balancing selection in the S-locus region, could explain variation of the linked load has received conflicting support in the literature. In line with Stift et al. 19, but in contradiction with Llaurens et al. 13, we observed no overall effect of dominance on the magnitude of the load. Several technical and biological reasons could explain the contrasted results obtained in these different studies. First, phenotypic quantification of the linked load is experimentally demanding, such that these studies relied on the comparison of a limited number of alleles (three S-alleles in each of the studies) and therefore each of them had inherently low power. Second, the experimental procedures to reveal the load varied slightly. Llaurens et al.13 used CO2 treatment to by-pass the SI system and obtain homozygous progenies from crosses that would otherwise have been incompatible, whereas we used the “natural” masking by dominant S-alleles to enable the obtention of recessive homozygous genotypes. Our approach is experimentally simpler and avoids the possible contamination by offspring obtained by selfing, which may combine the effect of the sheltered load with that of genome-wide inbreeding depression (see Stift et al. 19 for a detailed discussion of this caveat). Third, a limitation of our approach is that it is restricted to S-alleles that are recessive or intermediate along the dominance hierarchy, and is thus not applicable to quantify the load associated with the most dominantS-alleles under more intense balancing selection. It is therefore possible that the S-alleles we examined did not exhibit sufficiently contrasted levels of dominance, in particular if only the most dominant ones are generating a substantial load, as suggested for fully linked deleterious mutations. In addition, since the homozygous S-allele genotypes we create correspond to different gene copies from the population, they may carry distinct sets of linked deleterious variants, especially for the more recessive alleles. The variation we observed in the phenotypic magnitude of the load among families confirms that linked deleterious variants are unlikely to be fixed within all allele lineages. Finally, we note that our genomic analysis of the genetic load shows that the dominant allele Ah15 previously associated with reduced fitness in homozygotes 13,isindeedunusualintermsofthenumberof mutations it carries. In fact, it is one of the most “loaded” alleles among all the dominant S-alleles present in this population, possibly explaining why Llaurens et al. 13 observed a significant effect despite the inherently limited experimental power of their analysis.
Our stochastic simulations and genomic analyses concur to the conclusion that the intensity of balancing selection applied to each allele affects the genetic architecture of the linked load: a larger proportion of putatively deleterious mutations are fixed among gene copies of the dominant as compared to the recessive S-alleles, while gene copies of the recessive S-alleles tend to accumulate more segregating deleterious variation. While these two processes eventually compensate one another, they may have distinct consequences for the evolution of S-alleles. Uyenoyama 12 showed that the existence of a sheltered load should influence the evolutionary dynamics of new S-alleles through self-compatible intermediates. Specifically, antagonistic interactions are expected between ancestral and derived functional specificities because they would initially share their linked deleterious mutations, slowing down the establishment of new alleles. Our observation that partially different sets of linked mutations are associated with S-alleles from the different populations raises the question of whether the (short) time scale at which recombination decouples alleles from their sets of linked mutation is sufficiently fast to impede such antagonistic interactions to take place. In other words, the effect of the load on the diversification dynamics should be most important if the two mutational steps required for the emergence of new alleles under this model take place within local populations, rather than involving a metapopulation-scale process. As shown by Stetsenko et al.32, this is expected to occur under very low dispersal only. In addition, the observation that the architecture of the sheltered load differs between dominant and recessive S-alleles suggests that their diversification dynamics may also differ. Specifically, the self-compatible intermediates required for the formation of new S-alleles33,34 are expected to be capable of selfing as well as forming homozygous genotypes that would otherwise be prevented. While the consequences of selfing may be equivalent for all alleles (because the overall number of mutations to which they are linked are equivalent), the consequences of the formation of homozygotes allowed by the crossing of separate individuals sharing a given S-allele are expected to be more severe for dominant S-alleles. The segregation of distinct deleterious variants linked to different gene copies of recessive S-alleles implies that linked recessive deleterious mutations are likely to remain masked when two distinct gene copies of a given recessive S-allele are brought together. Hence, our results lead to the prediction that in natural populations self-compatible mutants may segregate more readily for the more recessive than for the more dominant S-alleles, and more generally for allelic lineages under lower intensity of balancing selection. Considering that self-compatible mutants are a necessary intermediate stage in the formation of new S-alleles, one may predict that the diversification dynamics should be more efficient for lineages of recessive than dominant S-alleles. This prediction is in line with the deeper phylogenetic divergence among the most dominant S -alleles observed in Arabidopsis16. Detailed quantification of the presence of self-compatible variants in natural populations will now be necessary to test this hypothesis. At this stage, however, a proper model of allelic diversification taking into account dominance interactions among S-alleles is still missing.
Our observation that the genetic load varies across balanced allelic lines is not unprecedented. The classical case of Y or W sex chromosomes are indeed examples where one balanced line accumulates a greater genetic load than the other (X or Z, respectively), eventually leading to substantial genetic degeneration35–37. Another example is the supergene controlling variation in male plumage phenotypes of the ruff, where the genetic load on the derived “Satellite” haplotype is higher than on the ancestral “Independent” haplotype3,38. Similarly, in the butterfly Heliconius numata, the inverted haplotypes conferring mimetic wing patterns tend to accumulate a greater load than the non-inverted haplotypes39. Interestingly, in all these cases, the haplotypes with the greatest load also act genetically in a dominant manner, establishing a clear parallel with our observations. An interesting next step will be to determine whether similar asymmetries between the balanced allelic lines are observed in the linked region for these other systems.
It is clear from our results that S-allele dominance can affect the linked load, through its effect on the intensity of balancing selection, but in turn the differences in structure of the linked load may affect the conditions under which dominance can evolve. The Brassicaceae S-locus is indeed an interesting and rather unique system, where the mechanisms by which dominance is controlled and evolves under the action of so-called “dominance modifiers” have been studied in detail 39,40 (involving the interaction between small non-coding RNAs and their target sites 16, 40). The presence of deleterious mutations linked to S-alleles has been shown to affect the evolution of dominance modifiers, favouring evolution towards greater dominance than towards greater recessivity 40. This asymmetry arises from the fact that S-alleles that become recessive (e.g. following acquisition of a recessivity modifier such as a small RNA target) will start forming homozygous genotypes, leading to expression of their linked load, while S-alleles that become dominant will not. Our observation that many deleterious mutations linked to recessive S-alleles are indeed segregating, suggests that expression of the load will be less severe for recessive than for dominant S-alleles, hence decreasing this predicted asymmetry. It will now be essential to modify models for the evolution of dominance to allow for such differential load among S-alleles.
Acknowledgements
This work was funded by the European Research Council (NOVEL project, grant #648321) and ANR TE-MoMa (grant ANR-18-CE02-0020-01). AL’s PhD thesis was funded by the ERC and the University of Lille. The authors thank Barbara Mable for sharing seeds of A. lyrata andCamilleRouxfor discussions. This work was performed using infrastructure and technical support of the Plateforme Serre, cultures et terrains expérimentaux - Université de Lille for the greenhouse/field facilities. The authors thank the UMR 8199 LIGAN-MP Genomics platform (Lille, France) which belongs to the ‘Federation de Recherche’ 3508 Labex EGID (European Genomics Institute for Diabetes; ANR-10-LABX-46) and was supported by the ANR Equipex 2010 session (ANR-10-EQPX-07-01; ‘LIGAN-MP’). The LIGAN-MP Genomics platform (Lille, France) is also supported by the FEDER and the Region des Hauts-de-France. The authors thank the GenoScreen platform (Lille, France).
Declaration of interests
The authors declare no competing interests
Methods STAR
Source plant material
We worked on natural accessions from two closely related species, A. halleri and A. lyrata, represented by two population samples named Mortagne (50°47’N, 3°47’E, France, n=60) and Nivelle (50°47’N, 3°47’E, France, n=61) for A. halleri, and three highly outcrossing population samples from the North American Great Lakes, named IND (Indiana Dunes National Lakeshore in Michigan, n=9), PIN (Pinery Provincial Park in Ontario, n=11) and TSS (Tobermory Provincial Park in Ontario, n=8)22 for A. lyrata (Fig. S9). The A. lyrata populations colonised North America from ancestral European populations about 20-30.000 years ago 29,30 and the A. halleri populations are peripheral and likely colonised the north of France during the last century from ancestral German populations31.
We performed 92, 91, 40, 43 and 21 controlled crosses between randomly chosen individuals within the Nivelle, Mortagne, IND, PIN and TSS populations, respectively. We successfully obtained seeds from 60, 66, 21, 21 and 10 of these crosses, respectively. Because we were not interested in estimating population frequencies of S-alleles, we instead tried to maximise the number of reconstructed haplotypes and avoid over representing the most recessive S-allele (Ah01) that tends to segregate at very high frequencies in natural populations 15. To do this, we performed PCR with S-allele-specific primers 15,18 to screen the parents of the crosses and we removed from the experiment offspring with two parents carrying allele Ah01. For A. halleri, we selected 19 individuals from the Nivelle population and 19 individuals from the Mortagne population, based on their genotype at the S-locus (Fig. S9; Table S7). We also selected one offspring of 9, 11, 5, 6 and 5 pairs of selected individuals in the Nivelle, Mortagne, IND, PIN and TSS populations respectively for the phasing of S-haplotypes (Table S8; Fig. S9). To increase sample size for the phenotypic measurements, we included offspring from five additional crosses from the Nivelle population (Table S8).
Library preparation, capture and sequencing
We used a previously developed sequence capture approach to specifically sequence genomic regions of interest 20. Briefly, indexed genomic libraries were constructed for each individual and libraries were pooled in equimolar proportions. Fragments matching a series of regions of interest (including in particular the 75kb upstream and downstream of the non-recombining S-locus region as well as a series of 100 unlinked 25kb regions used as genomic controls 20), were then enriched using synthetic 120bp RNA probes and sequenced by Illumina MiSeq (a total of 159 million paired-end reads).
For six individuals (Table S7, S8), we completed the sequencing with genome-wide resequencing (WGS) in order to distinguish the homozygous and heterozygous genotypes at the S-locus based on read depth41, which is not possible using data from the capture protocol. The prepared libraries were sequenced by Illumina NovaSeq (2x 150pb, paired-end) from the GenoScreen platform (Lille, France).
Determination of the S-locus genotypes and dominance of S-alleles
We used a dedicated pipeline for genotyping the S-locus based on short reads sequencing41 obtained from each individual (Table S7 and S8). The level of dominance of S-alleles found in our study was determined based on either previous assessment of dominance in A. lyrata and A. halleri15,19,42–44 or indirectly inferred based on the observed association between the phylogeny of S-alleles and levels of dominance45.
Read mapping and variant calling in A. halleri and A. lyrata populations
Raw reads were mapped on the complete A. lyrata reference genome V1.0.23 21 using Bowtie2 v2.4.146, as described in Le Veve et al 20. File formats were then converted to BAM using samtools v1.3.147 and duplicated reads were removed with the MarkDuplicates program of picard-tools v1.119 (http://broadinstitute.github.io/picard). These steps were performed by the custom Pyt hon script sequencing_genome_vcf.py available at https://github.com/leveveaudrey/analysis-of-polymorphism-S-locus.
We obtained an average of 620 million properly mapped paired-end 300bp reads per population sample. For consistency, we conserved only reads which mapped to the S-locus flanking or control regions, even for samples sequenced by WGS, using the targetintercept option of bedtool v2.25.0 48. We called all SNPs within the chromosomal segment comprising 50 kb upstream from the first base of the gene Ubox in 3’ and 50 kb downstream from the last base of the gene ARK3 in 5’ of the S-locus using the Genome Analysis Toolkit v. 3.8 (GATK)49 with the option GVCF and a quality score threshold of 60 using vcftool v0.1.15 50. This region contains 20 annotated protein-coding genes. In this study we excluded the genes inside the S-locus itself (SCR, SRK). For each sample independently, we computed the distribution of coverage depth across control regions using samtools depth 47. We excluded sites with either less than 15 reads aligned or coverage depth above the 97.5 % percentile, asthelatterarelikelytocorrespondtorepeatedsequences(e.g.transposableelementsor paralogs). Finally, we removed SNPs fixed in each population using the script 1_fix_pos_vcf.py (https://github.com/leveveaudrey/dominance_and_sheltered_load), thus retaining only nucleotide sites that were variable in the population.
Quantifying the sheltered load of deleterious mutations
We examined the genetic load signatures based on the accumulation of mutations on 0-fold degenerate sites, the vast majority of which are considered deleterious. The 0-fold and 4-fold degenerate sites were identified and extracted from the reference genome and the gene annotation using the script NewAnnotateRef.py 51. We also examined the genetic load signatures based on the accumulation of all non-synonymous mutations. We obtained a total of 2,441 and 2,435 variable positions for the A. halleri samples from Nivelle and Mortagne respectively, and 2,360 variable positions for the A. lyrata samples.
Phasing S-haplotypes
For each of the 9, 11, 5, 6 and 5 trios analysed in the Nivelle, Mortagne, IND, PIN and TSS populations respectively, we phased mutations in the flanking regions, resulting in 130 phased haplotypes (Fig. S9). Briefly, we used sites that were heterozygous in the offspring to resolve parental haplotypes by assuming no recombination between parent and offspring, thus attributing the allelic state that was shared between a parent and its offspring to their shared S-allele, and the allelic state that was not shared to the other (untransmitted) haplotype of the parent. Twelve of the parents had been used in more than one cross, and in these cases we phased their haplotypes only once (Table S8). We implemented the phasing procedure in the script 3_phase_S_allele.py available at https://github.com/leveveaudrey/dominance_and_sheltered_load.
Study of the structure of S-haplotypes
We first developed a new method to evaluate the distortion of the phylogenetic patterns caused by linkage to S-alleles. To do this, we used phyml 52 v.3.3 to calculate the likelihood of two contrasted topologies of interest: (1) the topology clustering haplotypes by the populations where they came from vs. (2) the topology clustering them by the S-allele to which they are linked (Fig 2A). We used sliding windows of sequences with 50 SNPs to obtain the variation of the difference in log-likelih ood between these two topologies along the chromosome. We then compared these values to their distribution throughout the genome obtained by random draws of sequences with 50 SNPs from the control regions. Second, we visualised the relationships among the phased haplotypes using maximum likelihood phylogenies based on the Tamura-Nei model 53, with 1,000 replicates in MEGA X54. Third, we followed Kamau et al.’s 24 approach and examined the variation of FST among populations within each species (Nivelle and Mortagne for A. halleri and IND, PIN and TSS for A. lyrata) along the flanking region in non-overlapping windows of 5kb. We also examined the variation of FST along the flanking region obtained by grouping haplotypes by their linked S-allele rather than by population of origin. Then, we compared these FST values computed in the S-locus flanking regions with their genomic distribution as determined from the 100 control regions. The FST values were estimated with the DNAsp 6 software 55. Fourth, we performed a major component analysis (MCA) based on SNPs in the first 5kb, SNPs between 5 and 25kb and SNPs between 25 and 50kb around the S-locus, using the R packages ‘ggplot2’ (version 3.4.0), ‘factoextra’ (version 1.0.7) and ‘FactoMiner’ (version 2.7). We compared the patterns obtained by these MCAs with those obtained from identical numbers of SNP (+/- 1%) from the control regions. Finally, we analysed genetic association in each population independently between each of the locally segregating variants and the S-alleles considered as phenotypes, using STRAT 56 V1.1 combined with Structure 57 V2.3. We examined the distribution of the top 0.1% most significant associations detected specifically for each S-allele in each population.
Estimation of the number of fixed and segregating deleterious mutations within S-allele lineages
For each variable position considered in the phased haplotypes, we estimated the number of mutations on 0-fold (S 0f) and 4-fold degenerate sites (S 4f) compared with the reference genome. We distinguished SNPs that were fixed from those that were segregating within each of the allelic lines. We used GLM with a Poisson distribution to test whether the number of fixed and segregating mutations were associated with S-allele dominance, considering populations as random effects. We reiterated the GLM analysis with the number of non-synonymous (S NS) and synonymous (S S) mutations.
Estimation of the phenotypic impact of homozygosity at the S-locus for three S-alleles
To determine if the genetic sheltered load putatively linked to the S-locus has a detectable phenotypic impact, we performed 45 crosses (Table S1) between offspring of the Nivelle individuals that we chose so that they shared one S-allele (Fig. S9). Based on the dominance hierarchy in pollen16 (Table S1), these crosses should correspond to compatible partners. The general principle of the experiment was to take advantage of the dominance hierarchy to mask recessive S-alleles and generate full sibs that were either homozygous (because they inherited the S-allele that was shared by their two parents) or heterozygous at the S-locus, and thus isolate the effect of homozygosity at the S-locus. Note that all offspring in our experiments were thus ”naturally” outcrossed, whereas Llaurens et al. 13 based their comparisons on outcrossed progenies obtained by enforced incompatible crosses and Stift et al. 19 based their comparisons on enforced selfed progenies. These crosses generated 399 seeds overall, with homozygous genotypes expected for the S-alleles Ah01, Ah03 and Ah04 forming the following dominance relationship: Ah01<Ah03<Ah04.
Seedlings were grown in a greenhouse between 14.5 and 23.1°C and a photoperiod of 16 hr day/8 hr night. Offspring from the six families were placed on tables, and their position randomised every three days. After three months of growing, all the germinated plants were vernalised under a temperature between 6 and 8°C and a natural photoperiod for two months (January-February). Then, all surviving plants began reproduction in a greenhouse under temperature between 10.6 and 25.3°C and a natural photoperiod. The genotypes at the S-locus were determined in surviving plants by a PCR approach, using S-allele-specific primers for the pistil-expressed SRK gene. We assessed the reproductive success of offspring from the different crosses on the basis of fourteen phenotypic traits (detailed below) and computed the mean difference for the trait between homozygotes and heterozygotes within each family. We also tested for departures from mendelian proportions of each S-locus genotypic category in the family after the apparition of the first stem. Significant departures were interpreted as reflecting differences in survival between homozygous and heterozygous S-locus genotypes. We performed 10,000 replicate simulations of mendelian segregation based on the S-locus genotype of the parents. We used GLM to test whether the phenotypic impact of homozygosity at the S-locus increased with dominance of the S-alleles. The models used for GLM depended on the type of trait analysed (poisson for the counts like the number of leaves, flowers by stems or days; gaussian for continue traits like the lengths, widths and areas).
We measured the following fourteen phenotypic traits: the time (days) to the first leaf measured by visual control every day during seven weeks after sowing the seeds, the number of leaves, the area of the rosette (cm²), the mean length and width of leaves (cm), the standard deviation of length and width of leaves (cm) and the mean area of leaves (cm 2) measured by ImageJ58 based on photographs taken seven weeks (+/- five days) after the first leaf. At reproduction, we measured the time to the first flower bud for the end of vernalisation (day), scored by visual control every three days during nine weeks, the number of flower buds per flower stem produced during four week after the appearance of the first bud, the number of flower stems, the length of the highest flower stem produced four weeks after the appearance of the first bud (cm), and finally the total duration of buds production (days), scored by visual control every three days during eleven weeks after the appearance of the first bud. The last trait we measured was the proportion of homozygotes per family that survived until reproduction assuming mendelian proportions in the seeds. During the whole experiment, the presence of phytophagous insects, pathogens and stress markers were scored as binary variables. The presence of phytophagous insects and pathogen attacks were detected by the occurrence of gaps in leaves. Oxidative stress was scored qualitatively based on the occurrence of purple leaves. We also controlled the effect of the family on the phenotypic trait. These effects were controlled by redistributing 10,000 times the values observed in groups of the same size observed for each effect (for example, presence or absence of pathogen attack) and comparing the difference for the trait observed with the distribution of the differences obtained in the permutations. We considered the impact of the effect on the trait if the observed difference between groups was higher than the 95% percentile of the distribution obtained randomly (Table S9). When the test was significant, the effect was implemented as a random effect in the GLM. We used the same method to control for the family effect, which was included as a random effect in GLM if necessary (Table S10). The general experimental procedure is summarised in Fig. S9 and all data analyses were done in R ver. 3.1.2 (R Development Core Team 2014).
Simulations
Finally, we refined the model of Llaurens et al., 13 in several ways. We simulated a panmictic population of N diploid individuals with non-overlapping generations. Each individual was defined by its genotype in a non-recombining genomic region. This region contains the S-locus, and a D locus where deleterious mutations accumulated. For the S-locus, we used a simple model of sporophytic SI, with four dominance classes, as observed in A. halleri 41 (only three classes were considered before13), and fourteen S-alleles (eight alleles in the class IV, three in the class III, two in the class II and one allele in the class I). This distribution mirrors that of the Nivelle population (Table S7), with the exception that a class II allele has been added because its presence has been reported in previous studies 15. Alleles within classes were assumed to be codominant with each other, and dominant over all alleles of the more recessive classes, with the following linear hierarchy between classes : classI<classII<classIII<classIV). We also assumed that no new S-allelecouldappearby mutation during the simulations. The population size was 10,000 diploid individuals, so as to be large enough to avoid S-allele loss by drift during the simulations (previously it was 1,000). The “D locus” comprised one hundred fully linked biallelic positions (versus a single one in Llaurens et al. 13). Fully recessive deleterious mutations were recurrently introduced (at a rate 10 -4), and reverse mutations were possible (at a rate 10 -5). We ignored partially recessive deleterious mutations because these mutations were predicted to be effectively eliminated by natural selection in Llaurens et al. 13. The survival probability p of a zygote depended on its genotype at the D locus: p = (1 = s)n with s the selection coefficient and n the number of positions homozygous for the mutated allele. We explored different values of the selection coefficient (0.1, 0.05, 0.03, 0.01 and 0.005). Under strong selection (s=0.1, 0.05 and 0.03), the combined effect of multiple mutations led to low-fitness individuals, eventually causing population extinction. Under weak selection, (s=0.005), we observed near fixation of the deleterious mutations under the influence of asymmetrical mutation. Hence, we focused on the intermediate value of the selection coefficient (s=0.01), where deleterious mutations segregated stably in the simulations.
We first ran simulations without deleterious mutations until a deterministic equilibrium for S-allele frequencies was reached, which was considered to be attained when allelic frequencies changed by less than 10 -3 between generations. Recessive deleterious mutations were then allowed to accumulate at the positions within the D locus. Each simulation was performed with 100 independent replicates of 100,000 generations, and the frequency of the deleterious alleles was recorded every 1,000 generations. At the end of the simulation runs, we estimated the number of deleterious mutations found in each haplotype associated with each S-allele to determine the expected patterns of association between the sheltered load and dominance at the S-locus.
Data Availability
All sequence data are available in the NCBI Short Read Archive (SRA; https://www.ncbi.nlm.nih.gov/sra) with accession codes: PRJNA744343, PRJNA755829.
All scripts developed are available in Github (https://github.com/leveveaudrey/dominance_and_sheltered_load https://github.com/leveveaudrey/analysis-of-polymorphism-S-locus).
Supplementary data
Figures legend
Supplemental items
References
- 1.Evolution of genomic structures on mammalian sex chromosomesCurrent Genomics 13:115–123https://doi.org/10.2174/138920212799860625
- 2.Recombination suppression and evolutionary strata around mating-type loci in fungi: documenting patterns and understanding evolutionary and mechanistic causesNew Phytologist 229:2470–2491https://doi.org/10.1111/nph.17039
- 3.Structural genomic changes underlie alternative reproductive strategies in the ruff (Philomachus pugnax)Nat Genet 48:84–88https://doi.org/10.1038/ng.3430
- 4.Primula vulgaris (primrose) genome assembly, annotation and gene expression, with comparative genomics on the heterostyly supergeneSci Rep 8https://doi.org/10.1038/s41598-018-36304-4
- 5.Genealogical structure among alleles regulating self-incompatibility in natural populations of flowering plantsGenetics 147:1389–1400https://doi.org/10.1093/genetics/147.3.1389
- 6.Evolution under tight linkage to mating typeNew Phytol 165:63–70https://doi.org/10.1111/j.1469-8137.2004.01246.x
- 7.Mutation load at a mimicry supergene sheds new light on the evolution of inversion polymorphismsNat Genet 53:288–293https://doi.org/10.1038/s41588-020-00771-1
- 8.Sheltering of deleterious mutations explains the stepwise extension of recombination suppression on sex chromosomes and other supergenesPLOS Biology 20https://doi.org/10.1371/journal.pbio.3001698
- 9.Incompatibility and incongruity in wild and cultivated plantsSpringer-Verlag, Berlin Heidelberg
- 10.The male determinant of self-incompatibility in BrassicaScience 286:1697–1700https://doi.org/10.1126/science.286.5445.1697
- 11.Self-incompatibility in the genus Arabidopsis: characterization of the S locus in the outcrossing A. lyrata and its autogamous relative A. thalianaPlant Cell 13:627–643
- 12.Genealogy-dependent variation in viability among self-incompatibility genotypesTheoretical Population Biology 63:281–293https://doi.org/10.1016/S0040-5809(03)00020-0
- 13.The sheltered genetic load linked to the S locus in plants: new insights from theoretical and empirical approaches in sporophytic self-incompatibilityGenetics 183:1105–1118https://doi.org/10.1534/genetics.109.102707
- 14.Plant self-incompatibility in natural populations: a critical assessment of recent theoretical and empirical advancesMolecular Ecology 13:2873–2889https://doi.org/10.1111/j.1365-294X.2004.02267.x
- 15.Does frequency-dependent selection with complex dominance interactions accurately predict allelic frequencies at the self-incompatibility locus in Arabidopsis halleriTEvolution 62:2545–2557https://doi.org/10.1111/j.1558-5646.2008.00469.x
- 16.Dominance hierarchy arising from the evolution of a complex small RNA regulatory networkScience 346:1200–1205https://doi.org/10.1111/eva.12933
- 17.A general model to explore complex dominance patterns in plant sporophytic self-incompatibility systemsGenetics 175:1351–1369https://doi.org/10.1016/j.tig.2011.06.005
- 18.Contrasted patterns of molecular evolution in dominant and recessive self-incompatibility haplotypes in ArabidopsisPLoS Genetics 8https://doi.org/10.1371/journal.pgen.1002495
- 19.Inbreeding depression in self-incompatible North-American Arabidopsis lyrata : disentangling genomic and S-locus-specific genetic loadHeredity 110:19–28https://doi.org/10.1038/hdy.2012.49
- 20.Long-term balancing selection and the genetic load linked to the self-incompatibility locus in Arabidopsis halleri and A. lyrataMolecular Biology and Evolution msad 120https://doi.org/10.1093/molbev/msad120
- 21.The Arabidopsis lyrata genome sequence and the basis of rapid genome size changeNat Genet 43:476–481https://doi.org/10.1038/ng.807
- 22.Reconstructing origins of loss of self-incompatibility and selfing in North american Arabidopsis lyrata : a population genetic contextEvolution 64:3495–3510https://doi.org/10.1111/j.1558-5646.2010.01094.x
- 23.Balancing selection and its effects on sequences in nearby genome regionsPLOS Genetics 2https://doi.org/10.1371/journal.pgen.0020064
- 24.Linkage Disequilibrium and Recombination Rate Estimates in the Self-Incompatibility Region of Arabidopsis lyrataGenetics 176:2357–2369https://doi.org/10.1534/genetics.107.072231
- 25.Sheltered load associated with S-alleles in Solanum carolinenseHeredity 92:335–342https://doi.org/10.1038/sj.hdy.6800425
- 26.The population genetics of the self-incompatibility polymorphism in Papaver rhoeasX. An association between incompatibility genotype and seed dormancy. Heredity 75:92–97https://doi.org/10.1038/hdy.1995.108
- 27.The identification of the Rosa S-locus and implications on the evolution of the Rosaceae gametophytic self-incompatibility systemsSci Rep 11https://doi.org/10.1038/s41598-021-83243-8
- 28.The effect of sheltered load on reproduction in Solanum carolinense, a species with variable self-incompatibilitySex. Plant Reprod 22:63–67https://doi.org/10.1007/s00497-008-0092-x
- 29.Population genetic structure of Arabidopsis lyrata in EuropeMolecular Ecology 15:2753–2766https://doi.org/10.1111/j.1365-294X.2006.02973.x
- 30.Patterns of polymorphism and demographic history in natural populations of Arabidopsis lyrataPLOS ONE 3https://doi.org/10.1371/journal.pone.0002411
- 31.Multiple origin of metallicolous populations of the pseudometallophyte Arabidopsis halleri (Brassicaceae) in central Europe: the cpDNA testimonyMolecular Ecology 14:4403–4414https://doi.org/10.1111/j.1365-294X.2005.02739.x
- 32.Balancing selection and the crossing of fitness valleys in structured populations: diversification in the gametophytic self-incompatibility systemEvolution 77:907–920https://doi.org/10.1093/evolut/qpac065
- 33.Origin and diversification dynamics of self-Incompatibility haplotypesGenetics 188:625–636https://doi.org/10.1534/genetics.111.127399
- 34.The evolution of restricted recombination in sex chromosomesTrends in Ecology & Evolution 24:94–102https://doi.org/10.1016/j.tree.2008.09.010
- 35.How to make a sex chromosomeNat Commun 7https://doi.org/10.1038/ncomms12087
- 36.Why do sex chromosomes stop recombining?Trends Genet 34:492–503https://doi.org/10.1016/j.tig.2018.04.001
- 37.Low mutation load in a supergene underpinning alternative male mating strategies in ruffhttps://doi.org/10.1101/2022.04.27.489720
- 38.Complex basis of hybrid female sterility and Haldane’s rule in Heliconius butterflies: Z-linkage and epistasisMolecular Ecology 31:959–977https://doi.org/10.1111/mec.16272
- 39.Evolution of self-incompatibility in the Brassicaceae: Lessons from a textbook example of natural selectionEvolutionary Applications https://doi.org/10.1111/eva.12933
- 40.Evolution of dominance in sporophytic self-incompatibility systems: I. Genetic load and coevolution of levels of dominance in pollen and pistilEvolution 63:2427–2437https://doi.org/10.1111/j.1558-5646.2009.00709.x
- 41.Genotyping and de novo discovery of allelic variantsattheBrassicaceaeself-incompatibilitylocusfromshortreadsequencingdataMolBiol Evol https://doi.org/10.1093/molbev/msz258
- 42.Recombination, balancing selection and phylogenies in MHC and self-incompatibility genesGenetics 159:1833–1844https://doi.org/10.1093/genetics/159.4.1833
- 43.Estimating the number, frequency, and dominance of S -alleles in a natural population of Arabidopsis lyrata (Brassicaceae) with sporophytic control of self-incompatibilityHeredity 90:422–431https://doi.org/10.1038/sj.hdy.6800261
- 44.Uneven segregation of sporophytic self-incompatibility alleles in Arabidopsis lyrataJournal of Evolutionary Biology 17:554–561https://doi.org/10.1111/j.1420-9101.2004.00699.x
- 45.Phenotypic and genotypic expression of self-incompatibility haplotypes in Arabidopsis lyrata suggests unique origin of alleles in different dominance classesMolecular Biology and Evolution 22:1609–1620https://doi.org/10.1093/molbev/msi153
- 46.Fast gapped-read alignment with Bowtie 2Nat Methods 9:357–359https://doi.org/10.1038/nmeth.1923
- 47.1000 Genome Project Data Processing SubgroupThe Sequence Alignment/Map format and SAMtools. Bioinformatics 25:2078–2079https://doi.org/10.1093/bioinformatics/btp352
- 48.BEDTools: a flexible suite of utilities for comparing genomic featuresBioinformatics 26:841–842https://doi.org/10.1093/bioinformatics/btq033
- 49.A framework for variation discovery and genotyping using next-generation DNA sequencing dataNat Genet 43:491–498https://doi.org/10.1038/ng.806
- 50.The variant call format and VCFtoolsBioinformatics 27:2156–2158https://doi.org/10.1093/bioinformatics/btr330
- 51.Evidence for widespread positive and negative selection in coding and conserved noncoding regions of Capsella grandifloraPLOS Genetics 10https://doi.org/10.1371/journal.pgen.1004622
- 52.New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0Syst Biol 59:307–321https://doi.org/10.1093/sysbio/syq010
- 53.Estimation of the number of nucleotide substitutions in the control region of mitochondrial DNA in humans and chimpanzeesMolecular Biology and Evolution 10:512–526https://doi.org/10.1093/oxfordjournals.molbev.a040023
- 54.MEGA X: Molecular evolutionary genetics analysis across computing platformsMolecular Biology and Evolution 35:1547–1549https://doi.org/10.1093/molbev/msy096
- 55.DnaSP 6: DNA sequence polymorphism analysis of large data setsMolecular Biology and Evolution 34:3299–3302https://doi.org/10.1093/molbev/msx248
- 56.Association mapping in structured populationsThe American Journal of Human Genetics 67:170–181https://doi.org/10.1086/302959
- 57.Documentation for structure software: Version 2.3Chicago, IL: University of Chicago :1–37
- 58.NIH Image to ImageJ: 25 years of image analysisNat Methods 9:671–675https://doi.org/10.1038/nmeth.2089
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
- Reviewed Preprint version 2:
- Version of Record published:
Copyright
© 2024, Le Veve et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 478
- downloads
- 60
- citations
- 2
Views, downloads and citations are aggregated across all versions of this paper published by eLife.