Extensive impact of low-frequency variants on the phenotypic landscape at population-scale

  1. Téo Fournier
  2. Omar Abou Saada
  3. Jing Hou
  4. Jackson Peter
  5. Elodie Caudal
  6. Joseph Schacherer (corresponding author)
  1. Université de Strasbourg, CNRS, GMGM UMR 7156, France
Research Article
Cite this article as: eLife 2019;8:e49258 doi: 10.7554/eLife.49258

Abstract

Genome-wide association studies (GWAS) make it possible to dissect complex traits and map genetic variants, but these variants often explain relatively little of the heritability. One potential reason is the preponderance of undetected low-frequency variants. To increase their allele frequency and assess their phenotypic impact in a population, we generated a diallel panel of 3025 yeast hybrids, derived from pairwise crosses between natural isolates, and examined a large number of traits. Parental versus hybrid regression analysis showed that while most phenotypic variance is explained by additivity, a third is governed by non-additive effects, with complete dominance having a key role. By performing GWAS on the diallel panel, we found that associated variants with low frequency in the initial population are overrepresented, and that they explain a fraction of the phenotypic variance and have an effect size similar to those of common variants. Overall, we highlight the relevance of low-frequency variants to phenotypic variation.

https://doi.org/10.7554/eLife.49258.001

Introduction

Natural populations are characterized by an astonishing phenotypic diversity. Variation observed among individuals of the same species represents a powerful raw material to develop better insight into the relationship existing between genetic variants and complex traits (Mackay et al., 2009). The recent advances in high-throughput sequencing and phenotyping technologies greatly enhance the ability to determine the genetic basis of traits in various organisms (Alonso-Blanco et al., 2016; Auton et al., 2015; Mackay et al., 2012; Peter et al., 2018). Dissection of the genetic mechanisms underlying natural phenotypic diversity is within easy reach when using classical mapping approaches such as linkage analysis and genome-wide association studies (GWAS) (Mackay et al., 2009; Visscher et al., 2017). Alongside these major advances, however, it must be noted that there are some limitations. All genotype-phenotype correlation studies in humans and other model eukaryotes have identified causal loci in GWAS explaining relatively little of the observed phenotypic variance of most complex traits (Eichler et al., 2010; Hindorff et al., 2009; Manolio et al., 2009; Shi et al., 2016; Stahl et al., 2012; Wood et al., 2014; Zuk et al., 2014).

Despite the efforts made to find the genetic variants responsible for complex traits, the variants found explain only a small part of the heritability, that is, the fraction of the phenotypic variance explained by the underlying genetic variability. One of the most striking examples is human height. This trait is estimated to be 60–80% heritable (Speed et al., 2017; Visscher et al., 2008), but the close to 700 variants found in an analysis of more than 250,000 individuals explain only 20% of this total heritability (Wood et al., 2014). Multiple explanations for this so-called missing heritability have been suggested, including the presence of low-frequency variants (Gibson, 2012; Hindorff et al., 2009; Manolio et al., 2009; Pritchard, 2001; Walter et al., 2015), structural variants (e.g. copy number variants) (Peter et al., 2018), small effect variants, as well as the low power to estimate non-additive effects (Cordell, 2009; Mackay, 2014; Zuk et al., 2012).

Variants present in less than 5% of the individuals are termed low-frequency variants and are known to be involved in a large number of rare Mendelian disorders (Gibson, 2012). However, the involvement of rare variants is also pervasive in common diseases and other complex traits. Assessing the impact and effect of low-frequency variants at a population scale and across a large phenotypic spectrum will provide better insight into the genetic architecture of phenotypic variation in a species. Because GWAS cannot deal with low-frequency and rare variants due to statistical limitations, except with very large sample sizes, their effect has often been overlooked.

Among model organisms, the budding yeast Saccharomyces cerevisiae is especially well suited to dissect variations observed across natural populations (Fay, 2013; Peter and Schacherer, 2016). S. cerevisiae isolates can be found in a broad array of biotopes both human-associated (e.g. wine, sake, beer and other fermented beverages, food, human body) or wild (e.g. plants, soil, insects) and are distributed world-wide (Peter et al., 2018). Phenotypic diversity among yeast isolates is significant and the S. cerevisiae species presents a high level of genetic diversity (π = 3×10−3), much greater than that found in humans (Lek et al., 2016). Because of their small and compact genomes, an unprecedented number of 1,011 S. cerevisiae natural isolates has recently been sequenced (Peter et al., 2018). Yeast genome-wide association analyses have revealed functional Single Nucleotide Polymorphisms (SNPs), explaining a small fraction of the phenotypic variance (Peter et al., 2018). However, these analyses highlighted the importance of the copy number variants (CNVs), which account for a larger proportion of the phenotypic variance and have greater effects on phenotypes compared to the SNPs. Nevertheless, even when CNVs and SNPs are taken together, the phenotypic variance explained is still low (approximately 17% on average) and consequently a large part of it is unexplained.

Interestingly, most of the genetic polymorphisms detected in the 1,011 yeast genomes dataset are low-frequency variants, with almost 92.7% of the polymorphic sites associated with a minor allele frequency (MAF) lower than 0.05. This trend is similar to that observed in the human population (Auton et al., 2015; Walter et al., 2015) and raises the question of the impact of low-frequency variants on the phenotypic landscape within a population and on the missing heritability (Zuk et al., 2014). Here, we investigated the genetic architecture underlying phenotypic variation and sought to unravel part of the missing heritability by accounting for low-frequency genetic variants at a population-wide scale and for non-additive effects controlled by a single locus. For this purpose, we generated 3025 hybrids, derived from pairwise crosses between a subset of natural isolates from the 1,011 S. cerevisiae population, and examined a large set of traits in them. This diallel crossing scheme allowed us to capture the fraction of the phenotypic variance controlled by both additive and non-additive phenomena as well as to infer the main modes of inheritance for each trait. We also took advantage of the intrinsic power of this diallel design to perform GWAS and assess the role of the low-frequency variants on complex traits.

Results

Diallel panel and phenotypic landscape

Based on the genomic and phenotypic data from the 1,011 S. cerevisiae isolate collection (Peter et al., 2018), we selected a subset of 55 isolates that were diploid, homozygous, genetically diverse (Figure 1a), and originated from a broad range of ecological sources (Figure 1b) (e.g. tree exudates, Drosophila, fruits, fermentation processes, clinical isolates) as well as geographical origins (Europe, America, Africa and Asia) (Figure 1c and Supplementary file 1). A full diallel cross panel was constructed by systematically crossing the 55 selected isolates in a pairwise manner (Figure 1d). In total, we generated 3025 hybrids, representing 2970 heterozygous hybrids with a unique parental combination and 55 homozygous hybrids. All 3025 hybrids were viable, indicating no dominant lethal interactions existed between the parental isolates. We then screened the entire set of the parental isolates and hybrids for quantification of mitotic growth abilities across 49 conditions that induce various physiological and cellular responses (Figure 1—figure supplement 1, Figure 1—figure supplement 2, Supplementary file 2). We used growth as a proxy for fitness traits (see Materials and methods). Ultimately, this phenotyping step led to the characterization of 148,225 hybrid/trait combinations.
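
The panel sizes quoted above follow from simple counting, and a few lines of Python make the arithmetic explicit. This is only a sketch: the function name `diallel_counts` is hypothetical, and it assumes that reciprocal crosses (parent A as MATa with parent B as MATα, and the reverse) are counted as distinct hybrids, which is what the reported numbers imply.

```python
from itertools import product


def diallel_counts(n_parents: int, n_conditions: int) -> dict:
    """Count hybrids in a full diallel (reciprocals and selfs included)."""
    # Every ordered pair (MATa parent, MATalpha parent) is one cross.
    crosses = list(product(range(n_parents), repeat=2))
    homozygous = sum(1 for a, b in crosses if a == b)
    heterozygous = len(crosses) - homozygous
    return {
        "hybrids": len(crosses),
        "heterozygous": heterozygous,
        "homozygous": homozygous,
        "hybrid_trait_combinations": len(crosses) * n_conditions,
    }


counts = diallel_counts(n_parents=55, n_conditions=49)
print(counts)
```

With 55 parents and 49 conditions, the counts match the 3025 hybrids (2970 heterozygous, 55 homozygous) and 148,225 hybrid/trait combinations reported in the text.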

Figure 1
Diversity of the 55 selected natural isolates and diallel design.

(a) Pairwise sequence diversity between each pair of parental strains. (b) Ecological origins of the selected strains. See also Supplementary file 1. (c) Geographical origins of the selected strains. (d) Generation of the diallel hybrid panel. 55 natural isolates available as both mating types as stable haploids were crossed in a pairwise manner to obtain 3025 hybrids. This panel was then phenotyped on 49 growth conditions impacting various cellular processes.

https://doi.org/10.7554/eLife.49258.002

Estimation of genetic variance components using the diallel panel (additive vs. non-additive)

The diallel cross design allows for the estimation of additive vs. non-additive genetic components contributing to the variation in each trait by calculating the combining abilities following Griffing’s model (Griffing, 1956). For each trait, the General Combining Ability (GCA) for a given parent refers to the average fitness contribution of this parental isolate across all of its corresponding hybrid combinations, whereas the Specific Combining Ability (SCA) corresponds to the residual variation not accounted for by the sum of the GCAs of the two parents in the combination. Consequently, the phenotype of a given hybrid can be formulated as µ + GCA(parent1) + GCA(parent2) + SCA(hybrid), where µ is the mean fitness of the population for a given trait. We found a near perfect correlation (Pearson’s r = 0.995, p-value<2.2e-16) between expected and observed phenotypic values, confirming the accuracy of the model used (see Materials and methods). Using GCA and SCA values, we estimated both broad- (H2) and narrow-sense (h2) heritabilities for each trait (Figure 2). Broad-sense heritability is the fraction of phenotypic variance explained by genetic contribution. In a diallel cross, the total genetic variance is equal to the sum of the GCA variance of both parents and the SCA variance in each condition. Narrow-sense heritability refers to the fraction of phenotypic variance that can be explained only by additive effects and corresponds to the variance of the GCA in each condition (Figure 2a). The H2 values for each condition ranged from 0.64 to 0.98, with the lowest value observed for fluconazole (1 µg.ml−1) and the highest for sodium meta-arsenite (2.5 mM), respectively. The additive part (h2 values) ranged from 0.12 to 0.86, with the lowest value for fluconazole (1 µg.ml−1) and the highest for sodium meta-arsenite (2.5 mM), respectively.
While broad- and narrow-sense heritabilities are variable across conditions, we also observed that on average, most of the phenotypic variance can be explained by additive effects (mean h2 = 0.55). However, non-additive components contribute significantly to some traits, explaining on average one third of the phenotypic variance observed (mean H2 - h2 = 0.29) (Figure 2b). Despite a good correlation between broad- and narrow-sense heritabilities (Pearson’s r = 0.809, p-value=1.921e-12) (Figure 2c), some traits display a larger non-additive contribution, such as growth in galactose (2%) or ketoconazole (10 µg.ml−1). Interestingly, these two conditions proved to be mainly controlled by dominance (see below). Altogether, our results highlight the main role of additive effects in shaping complex traits at a population scale and clearly show that this is not restricted to the single yeast cross where this trend was first observed (Bloom et al., 2013; Bloom et al., 2015). Nonetheless, non-additive effects still explain a third of the observed phenotypic variance. This result also corroborates at a species-wide level the extensive impact of non-additive effects on phenotypic variance (Forsberg et al., 2017; Yadav et al., 2016).
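
The decomposition µ + GCA + GCA + SCA and the two heritabilities can be illustrated with a short NumPy sketch. It is not the authors' implementation: the function names are hypothetical, the variances are simple plug-in estimates rather than ANOVA-based variance components, and the error variance `var_error` is assumed to be supplied separately (for example from replicate measurements).

```python
import numpy as np


def combining_abilities(y):
    """Decompose a full diallel matrix y[i, j] (parent i x parent j).

    Returns (mu, gca, sca) such that y[i, j] = mu + gca[i] + gca[j] + sca[i, j].
    """
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    # GCA of parent i: its average over all crosses where it is either
    # the first or the second parent, expressed as a deviation from mu.
    gca = (y.mean(axis=1) + y.mean(axis=0)) / 2.0 - mu
    # SCA: what remains once mu and both parental GCAs are removed.
    sca = y - mu - gca[:, None] - gca[None, :]
    return mu, gca, sca


def heritabilities(gca, sca, var_error):
    """Plug-in broad- (H2) and narrow-sense (h2) heritabilities.

    Assumption: genetic variance = 2 * var(GCA) + var(SCA), and the
    phenotypic variance adds a separately estimated error variance.
    """
    var_additive = 2.0 * np.var(gca)
    var_genetic = var_additive + np.var(sca)
    var_phenotypic = var_genetic + var_error
    return var_genetic / var_phenotypic, var_additive / var_phenotypic


# Toy example: 5 parents, mostly additive trait with a little non-additive noise.
rng = np.random.default_rng(0)
true_gca = rng.normal(size=5)
true_gca -= true_gca.mean()
y = 1.0 + true_gca[:, None] + true_gca[None, :] + rng.normal(scale=0.1, size=(5, 5))
mu, gca, sca = combining_abilities(y)
H2, h2 = heritabilities(gca, sca, var_error=0.01)
print(f"H2 = {H2:.2f}, h2 = {h2:.2f}, non-additive = {H2 - h2:.2f}")
```

By construction, the difference H2 - h2 in this sketch is the share of phenotypic variance attributed to the SCA term, which mirrors the non-additive fraction discussed above.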

Figure 2
Heritability measurements.

(a) The whole bar represents the overall heritability (H2) for each condition tested. The orange part of each bar represents the narrow-sense heritability h2, that is the fraction of phenotypic variance explained by additive effects, while the blue part depicts the fraction of phenotypic variance explained by non-additive effects. (b) Overall mean additive and non-additive effects for every tested growth condition. (c) Representation of H2 as a function of h2 showing the relative additive versus non-additive effects for each condition. Outlier conditions in terms of non-additive variance will lie further away from the linear regression line. Pearson’s r (95% confidence interval: 0.684–0.889) with the corresponding p-value is displayed.

https://doi.org/10.7554/eLife.49258.007

Relevance of dominance for non-additive effects

To obtain a precise view of the non-additive components, the mode of inheritance and the relevance of dominance for genetic variance, we focused on the deviation of the hybrid phenotypes from the expected value under a full additive model. Under this model, the hybrid phenotype is expected to be equal to the mean of the two parental phenotypes, hereinafter referred to as the Mean Parental Value or Mid-Parent Value (MPV). Deviation from this MPV allowed us to infer the respective mode of inheritance for each hybrid/condition combination (Lippman and Zamir, 2007), that is additivity, partial or complete dominance towards one or the other parent and finally overdominance or underdominance (Figure 3a–b, see Materials and methods). Only 17.4% of all hybrid/condition combinations showed enough phenotypic separation between the parents and the corresponding hybrid to allow complete partitioning into the seven above-mentioned modes of inheritance. For the remaining 82.6% of cases, only overdominance and underdominance can be distinguished (Figure 3c). Interestingly, these events are not as rare as previously described (Zörgö et al., 2012), with 11.6% of overdominance and 10.1% of underdominance (Figure 3d). When a clear separation is possible (Figure 3e), one third of the condition/cross combinations detected were purely additive whereas the rest displayed a deviation towards one of the two parents, with no bias (Figure 3e). When looking at the inheritance mode in each condition, most of the studied growth conditions (32 out of 49) showed a prevalence of additive effects (Figure 3f). However, 17 conditions were not predominantly additive throughout the population. Indeed, a total of 12 conditions were detected as mostly dominant with 4 cases of best parent dominance, including galactose (2%) and ketoconazole (10 µg.ml−1), and 8 of worst parent dominance. The remaining five conditions displayed a majority of partial dominance (Figure 3f).
These results confirm the importance of additivity in the global architecture of traits, but more importantly, they clearly demonstrate the major role of dominance as a driver for non-additive effects. Nevertheless, the presence of conditions with a high proportion of partial dominance combined with the cases of over and underdominance may indicate a strong and pervasive impact of epistasis on phenotypic variation.
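
To make the classification by deviation from the MPV concrete, the sketch below assigns one of the seven modes to a single hybrid/condition combination. It is an illustration under stated assumptions, not the procedure used in the study: the function name, the tolerance `tol` and the separation rule (parents count as separated only if they differ by more than four times `tol`) are all hypothetical, and higher growth is assumed to mean the "best" parent.

```python
def inheritance_mode(p1: float, p2: float, hybrid: float, tol: float = 0.05) -> str:
    """Classify a hybrid phenotype relative to its two parental phenotypes.

    `tol` is a hypothetical tolerance, in phenotype units, within which
    two values are treated as equal.
    """
    low, high = min(p1, p2), max(p1, p2)

    # Hybrids outside the parental range are over- or underdominant,
    # whether or not the parents themselves can be told apart.
    if hybrid > high + tol:
        return "overdominance"
    if hybrid < low - tol:
        return "underdominance"

    # Parents too close to each other: the finer modes cannot be resolved.
    if high - low <= 4 * tol:
        return "unresolved"

    mpv = (p1 + p2) / 2.0  # mid-parent value expected under pure additivity
    if abs(hybrid - mpv) <= tol:
        return "additive"
    if abs(hybrid - high) <= tol:
        return "complete dominance (best parent)"
    if abs(hybrid - low) <= tol:
        return "complete dominance (worst parent)"
    if hybrid > mpv:
        return "partial dominance (best parent)"
    return "partial dominance (worst parent)"


for hybrid_value in (0.5, 0.65, 0.8, 1.0):
    print(hybrid_value, inheritance_mode(0.2, 0.8, hybrid_value))
```

A fixed tolerance keeps the example short; a real analysis would more plausibly rely on replicate measurements and statistical tests to decide whether two phenotypes differ.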

Figure 3
Mode of inheritance.

(a) Representation of the different modes of inheritance depending on the hybrid value when a separation can be achieved between parental strains and (b) if a clear separation cannot be achieved between parental strains. (c) Percentage of parental phenotypes separated from each other for which a complete partition of different inheritance modes can be achieved. (d) Inheritance modes for every cross and condition where no separation can be achieved between the two homozygous parents. (e) Inheritance modes for every cross and condition where a clear phenotypic separation can be achieved between the two homozygous parents. (f) The number of conditions in each main inheritance mode.

https://doi.org/10.7554/eLife.49258.008

Diallel design allows mapping of low-frequency variants in the population using GWAS

Next, we explored the contribution of low-frequency genetic variants (MAF <0.05) to the observed phenotypic variation in our population. Genetic variants considered by GWAS must have a relatively high frequency in the population to be detectable, usually over 0.05 for relatively small datasets (Visscher et al., 2017). Consequently, low-frequency variants are excluded from classical GWAS. However, the diallel crossing scheme is a powerful design to assess the phenotypic impact of low-frequency variants present in the initial population, as each parental genome is represented several times, creating haplotype mixing across the matrix and preserving the detection power in GWAS.

To avoid issues due to population structure, we selected a subset of hybrids from 34 unrelated isolates in the original panel to perform GWAS (see Materials and methods, Supplementary file 1). By combining known parental genomes, we constructed 595 hybrid genotypes in silico, matching one half matrix of the diallel plus the 34 homozygous diploids. We built a matrix of genetic variants for this panel and filtered SNPs to only retain biallelic variants with no missing calls. In addition, because the small number of unique parental genotypes creates extensive long-distance linkage disequilibrium, variants in such linkage disequilibrium were also removed (see Materials and methods), leaving a total of 31,632 polymorphic sites in the diallel population. Overall, 3.8% (a total of 1,180 SNPs) had a MAF lower than 0.05 in the initial population of the 1,011 S. cerevisiae isolates but surpassed this threshold in the diallel panel, reaching a MAF of 0.32 (Figure 4a–b).
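
The in silico construction of hybrid genotypes and the resulting shift in allele frequency can be sketched as follows. The code assumes that each homozygous parent is summarized by one 0/1 allele per SNP, so that a hybrid's dosage is the sum of its two parents' alleles; the function names and the toy data are hypothetical.

```python
from itertools import combinations_with_replacement

import numpy as np


def build_hybrid_genotypes(parent_alleles):
    """Combine homozygous parents into half-diallel hybrid dosages.

    parent_alleles: (n_parents, n_snps) array of 0/1 alleles.
    Returns the list of parent index pairs (i <= j, selfs included) and a
    (n_hybrids, n_snps) array of dosages in {0, 1, 2}.
    """
    parent_alleles = np.asarray(parent_alleles)
    n_parents = parent_alleles.shape[0]
    pairs = list(combinations_with_replacement(range(n_parents), 2))
    first, second = np.array(pairs).T
    dosages = parent_alleles[first] + parent_alleles[second]
    return pairs, dosages


def minor_allele_frequency(dosages):
    """MAF per SNP from diploid dosages (two alleles per individual)."""
    freq = np.asarray(dosages).mean(axis=0) / 2.0
    return np.minimum(freq, 1.0 - freq)


# Toy data: 34 parents, one SNP whose alternative allele is carried by 3 parents.
parents = np.zeros((34, 1), dtype=int)
parents[:3, 0] = 1
pairs, dosages = build_hybrid_genotypes(parents)
print(len(pairs), minor_allele_frequency(dosages))
```

With 34 parents this yields the 595 genotypes mentioned above (561 heterozygous combinations plus 34 homozygous diploids). Under these assumptions, an allele carried by k of the n parents has frequency k/n in the panel regardless of how rare it is in the 1,011 isolates, which is why the design lifts low-frequency variants above the detection threshold.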

Figure 4
Rare and low-frequency variants detection.

(a) Comparison of MAF for each SNP between the whole population (1011 strains) and the hybrid diallel matrix used for GWAS. Hollow blue circles represent the MAF of all SNPs common to the initial population and the diallel hybrids (31,632). Full orange circles show the MAF of significantly associated SNPs. Vertical orange line shows the 5% MAF threshold. (b) Proportion of SNPs with a MAF below 0.05. (c) Proportion of significantly associated SNPs with a MAF below 0.05. (d) Fraction of heritability explained for common and low-frequency variants. P-value was calculated using a two-sided Mann-Whitney-Wilcoxon test, difference in location of −4.5e−3 (95% confidence interval −7.9e−3 -1.4e−3). (e) Absolute effect size of common and low-frequency variants.

https://doi.org/10.7554/eLife.49258.009

To map additive as well as non-additive variants impacting phenotypic variation, we performed GWAS using two different models (Seymour et al., 2016) (see Materials and methods). We used a classical additive model, in which each of the three genotypes at a locus receives a distinct encoding and a linear relationship between trait and genotype is assessed. To account for non-additive inheritance, we also used an overdominant model, which only considers differences between heterozygotes and homozygotes, thus revealing overdominant and dominant effects. For each of these two models, we performed mixed-model association analysis of the 49 growth conditions with FaST-LMM (Lippert et al., 2011; Widmer et al., 2015). Overall, GWAS revealed 1723 significantly associated SNPs (Figure 4—source data 1), with 2 to 103 significant SNPs per condition and an average of 39 SNPs per condition. Minor allele frequencies of the significantly associated SNPs were determined in the 1011 sequenced genomes, from which the diallel parents were selected (Figure 4). Interestingly, 16.3% of the significant SNPs (281 in total) corresponded to low-frequency variants (MAF <0.05), with 19.5% of them (55 SNPs) being rare variants (MAF <0.01). This trend holds for both models, with 19.3% and 15.2% of low-frequency variants for the additive and overdominant models, respectively. It is important to note that the crossing scheme makes it possible to raise the MAF of low-frequency variants to a detectable threshold in the diallel panel and to query their effects, but that this remains difficult for truly rare variants (MAF <0.01), probably leading to an underestimation. However, these results clearly show that low-frequency variants indeed play a significant part in the phenotypic variance at the population scale.
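
The difference between the two encodings is easiest to see on a small genotype vector. The sketch below assumes genotypes are stored as dosages (0, 1 or 2 copies of the alternative allele); the function names are hypothetical, and the snippet only prepares the predictor that a mixed-model tool would test, without calling FaST-LMM itself.

```python
import numpy as np


def additive_encoding(dosages):
    """Additive model: the three genotypes keep distinct values 0, 1, 2."""
    return np.asarray(dosages, dtype=int)


def overdominant_encoding(dosages):
    """Overdominant model: heterozygotes (1) versus both homozygotes (0)."""
    return (np.asarray(dosages) == 1).astype(int)


dosages = np.array([0, 1, 2, 1, 0, 2])
print("additive:    ", additive_encoding(dosages))
print("overdominant:", overdominant_encoding(dosages))
```

Because both homozygous classes collapse to the same value in the second encoding, an association under that model reflects a heterozygote-specific deviation rather than a per-allele trend.
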
We then estimated the contribution of the significant variants to total phenotypic variation (see Materials and methods) in our panel and found that detected SNPs could explain 15% to 32% of the variance, with a median of 20% (Figure 4d). When looking at the variance explained by each variant relative to its allele frequency, it is noteworthy that low-frequency variants explained roughly the same proportion of the phenotypic variation (median of 20.2%) as the common SNPs (median of 19.6%) (Figure 4d). In addition, the variance explained by the associated rare variants was also higher on average than that of the rest of the detected SNPs (Figure 4—figure supplement 1a). This trend was robust and conserved across the two encoding models implemented, accounting for additive and overdominant effects (Figure 4—figure supplement 1a). However, these results cannot be extrapolated to the whole population and only hold in the scope of our diallel population, where these variants are now overrepresented compared to the natural population. Indeed, variance explained is related to the surveyed population because its value relies on the MAF of the variants. Therefore, in the whole natural population of 1011 isolates, their contribution to the phenotypic variance will be smaller because of their lower MAF. To obtain a value that is unrelated to the studied population, we measured their respective effect size (Figure 4e). Here again we found that on average, low-frequency variants have about the same effect size (mean of 0.23 sd) as the common variants (mean of 0.25 sd).
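
Why the variance explained depends on the surveyed population while the effect size does not can be shown with a textbook approximation. The sketch assumes a single additive SNP at Hardy-Weinberg proportions with a phenotype scaled to unit variance, in which case the fraction of variance explained is 2p(1 - p)β². These assumptions do not hold exactly in a diallel panel and the formula is not the estimator used in the study, so the numbers only illustrate the dependence on allele frequency.

```python
def variance_explained(effect_size_sd: float, maf: float) -> float:
    """Fraction of phenotypic variance explained by one additive SNP.

    Assumes Hardy-Weinberg proportions and an effect size expressed in
    phenotypic standard deviations (phenotype variance = 1).
    """
    return 2.0 * maf * (1.0 - maf) * effect_size_sd ** 2


effect = 0.23  # same per-allele effect in both populations
for label, maf in (("natural population", 0.025), ("diallel panel", 0.09)):
    print(f"{label}: MAF = {maf}, variance explained = {variance_explained(effect, maf):.4f}")
```

The same per-allele effect therefore accounts for a larger share of the variance once the allele becomes more frequent, which is the reason the effect size is the more portable quantity.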

To gain insight into the biological relevance of the set of associated SNPs, we first examined their distribution across the genome and found that 62.5% of them are in coding regions (with coding regions representing a total of 72.9% of the S. cerevisiae genome) (Figure 4—figure supplement 1b), with all of these SNPs distributed over a set of 546 genes. Over the last decade, an impressive number of quantitative trait locus (QTL) mapping experiments were performed on a myriad of phenotypes in yeast leading to the identification of 145 quantitative trait genes (QTG) (Peltier et al., 2019) and we found that 19 of the genes we detected are included in this list (Figure 4—figure supplement 1c). In addition, 22 associated genes were also found as overlapping with a recent large-scale linkage mapping survey in yeast (Bloom et al., 2019) (Figure 4—figure supplement 1c). We then asked whether the associated genes were enriched for specific gene ontology (GO) categories (Supplementary file 3). This analysis revealed an enrichment (p-value=5.39×10−5) in genes involved in ‘response to stimulus’ and ‘response to stress’, which is in line with the different tested conditions leading to various physiological and cellular responses.

SGD1 and the mapping of a low-frequency variant

Finally, we focused on one of the most strongly associated genetic variants out of the 281 low-frequency variants significantly associated with a phenotype. The chosen variant was characterized by two adjacent SNPs in the SGD1 gene and was detected in 6-azauracil (100 µg.ml−1) with a p-value of 2.75e-8 with the overdominant encoding and 6.26e-5 with the additive encoding. Its MAF in the initial population is only 2.5% and reached 9% in the diallel panel, with three genetically distant strains carrying it (Figure 5a). The SNPs are in the coding sequence of SGD1, an essential gene encoding a nuclear protein. The minor allele (AA) induces a synonymous change (TTG (Leu) → TTA (Leu)) at the first position and a non-synonymous mutation (GAA (Glu) → AAA (Lys)) at the second position (Figure 5a). The phenotypic advantage conferred by this allele was observed as a significant difference between hybrids homozygous for the minor allele, heterozygous hybrids and hybrids homozygous for the major allele (Figure 5b). To functionally validate the phenotypic effect of this low-frequency variant, CRISPR-Cas9 genome editing was used in the three strains carrying the minor allele (AA) in order to switch it to the major allele (GG) and assess its phenotypic impact. Both mating types were assessed for each strain. When phenotyping the wild-type strains containing the minor allele and the mutated strains with the major allele, we observed that the minor allele confers a phenotypic advantage of 0.2 in growth ratio compared to the major allele (Figure 5c), thereby validating the important phenotypic impact of this low-frequency variant. However, no assumptions can be made regarding the exact effect of this allele at the protein level because no precise characterization has ever been carried out on Sgd1p and no particular domain has been highlighted.

Figure 5
Low-frequency variant functional validation in 6-azauracil 100 µg.ml−1.

(a) Schematic representation of SGD1 with the relative position of the detected SNPs. The minor allele is represented in orange with its MAF in the population and in the diallel cross panel. (b) Boxplot and density plot of the normalized growth ratios for each genotype on 6-azauracil 100 µg.ml−1. The number of observations is displayed in the boxplots. (c) Phenotypic validation after allele replacement of the minor allele with the major allele using CRISPR-Cas9 in the strains carrying the minor allele. Error bars represent median absolute deviation (four replicates).

https://doi.org/10.7554/eLife.49258.012

Discussion

Understanding the source of the missing heritability is essential to precisely address and dissect the genetic architecture of complex traits. Over the years, the diallel hybrid panel design has proven its strength to dissect part of the genetic architecture of traits in populations. One of the main advantages of using such experimental design is the ability to precisely isolate the part of phenotypic variance that is controlled by additive effects from the one controlled by non-additive effects. While our analysis revealed that an important part of the phenotypic variance is linked to additive effects, about a third remains ruled by non-additive interactions encompassing dominance and epistasis. These results are in line with previous findings.

However, care should be taken with the classification of the mode of inheritance. Indeed, as we do not know how many loci are involved for each hybrid’s phenotype, we can only assess the final phenotypic outcome of all the genetic variants involved and not on a locus by locus basis. This classification does not take into account their number, effect size and interactions. Consequently, the mode of inheritance that we described here solely reflects how the phenotype of the hybrid varies with respect to its parents. For example, several interactions could take place with opposite effect, leading to a final phenotype that appears as being controlled by an additive mode of inheritance (i.e. the hybrid phenotype equal to the mid parent value). However, in the cases where dominance was detected as a mode of inheritance, this might reflect the presence of a single locus having a strong phenotypic impact acting dominantly thus being responsible by itself for the phenotype. Yet, if two hybrids show a complete dominance in the same condition, it does not mean that the same alleles are involved in both.

Although few low-frequency and rare variants were considered in our GWAS (4%) due to stringent filtering conditions, a strong enrichment in these variants was observed among the significantly associated ones (16%), demonstrating the ubiquity of low-frequency variants with important phenotypic impact. However, at the population level, even though they have effect sizes similar to those of common variants, they will not explain a large part of the variance, because variance explained depends on both effect size and allele frequency. A good example of this phenomenon has been seen in a study of human height in more than 700,000 individuals. A total of 83 significantly associated rare and low-frequency variants with effect sizes up to 2 cm have been mapped (Marouli et al., 2017). On average, they explained the same amount of phenotypic variation as common variants, which displayed much smaller effect sizes of about 1 mm. Our results suggest that a high number of low-frequency variants play a decisive role in the phenotypic landscape of a population, both in terms of number and effect size. Taken one by one, they do not explain much phenotypic variance in a large population. Yet, altogether, they might actually explain a greater part of the variation than that explained by common variants.

The contribution of rare and low-frequency variants to traits is largely unexplored. In humans, these genetic variants are widespread but only a few of them have been associated with specific traits and diseases (Walter et al., 2015). Recently, it has been shown that the missing heritability of height and body mass index is accounted for by rare variants (Wainschtein et al., 2019). We also recently found in yeast that most of the Quantitative Trait Nucleotides (QTNs) previously identified using linkage mapping were at low allele frequency in the 1,011 S. cerevisiae population (Hou et al., 2016; Hou et al., 2019; Peltier et al., 2019; Peter et al., 2018). A total of 284 QTNs were identified by linkage mapping and 150 of them are present at a low frequency in the population of 1011 isolates (Peltier et al., 2019; Peter et al., 2018). However, these QTNs were mapped in mostly closely related genetic backgrounds, encompassing a total of 59 strains with 30% of them being laboratory strains and 41% coming from the wine cluster, which has a very low genetic diversity (Peter et al., 2018). Moreover, experimentally validated QTNs are, most of the time, the genetic variants with the largest phenotypic impact, which has previously been recognized as inducing an ascertainment bias (Rockman, 2012). This also raises the question of whether these rare and large effect size alleles discovered in specific crosses are really relevant to the variation across most of the population.

Here, we quantified the contribution of low-frequency variants across a large number of growth conditions and found that among all the genetic variants detected by GWAS on a diallel panel, 16.3% have a low frequency in the initial population and explain a significant part of the phenotypic variance (21% on average). This particular diallel design also has an intrinsic power to evaluate the additive vs. non-additive genetic components contributing to the phenotypic variation. We assessed the effect of intra-locus dominance on the non-additive genetic component and showed that dominance at the single locus level contributed to the phenotypic variation observed. However, other more complicated inter-loci interactions may still be involved. Altogether, these results have major implications for our understanding of the genetic architecture of traits in the context of unexplained heritability. In parallel to a recent large-scale linkage mapping survey in yeast (Bloom et al., 2019), our study highlights the extensive role of low-frequency variants in phenotypic variation.

Materials and methods

Construction of the diallel panel

Selection of the S. cerevisiae isolates

Out of the collection of 1011 strains (Peter et al., 2018), a total of 53 natural isolates were carefully selected to be representative of the S. cerevisiae species. We selected isolates from a broad range of ecological origins and prioritized strains that were diploid, homozygous, euploid and as genetically diverse as possible, that is, with up to 1% sequence divergence. All the isolate details, including ecological and geographical origins, are listed in Supplementary file 1. In addition to these 53 isolates, we included two laboratory strains, namely ∑1278b and the reference S288c strain.

Generation of stable haploids

For each selected parental strain, stable haploid strains were obtained by deleting the HO locus. The HO deletions were performed using PCR fragments containing drug resistance markers flanked by homology regions upstream and downstream of the HO locus, using a standard yeast transformation method. Two resistance cassettes, KanMX and NatMX, were used for MATa and MATα haploids, respectively. The mating type (MATa or MATα) of antibiotic-resistant clones was determined using testers of known mating type. For each genetic background, we selected a MATa and a MATα clone resistant to G418 and nourseothricin, respectively.

Phenotyping of the parental haploid strains was performed to check for mating type-specific fitness effects. All MATa and MATα parental strains were tested on all 49 growth conditions (see below) using the same procedure as the phenotyping assay of the hybrid matrix. The overall correlation between the MATa and MATα parental strains was 0.967 (Pearson, p-value<1e-324), with an average correlation per strain of 0.976 across different conditions (Figure 1—figure supplement 3). No significant mating type specificity was identified.

Diallel scheme

Parental strains were arrayed and pregrown in liquid YPD (1% yeast extract, 2% peptone and 2% glucose) overnight. Mating was performed with ROTOR (Singer Instruments) by pinning and mixing MATa over MATα parental strains on solid YPD. The parental strains, that is, 55 MATa HO::∆KanMX and 55 MATα HO::∆NatMX strains, were arrayed and mated in a pairwise manner on YPD for 24 hr at 30°C. The mating mixtures were replicated on YPD supplemented with G418 (200 µg.ml−1) and nourseothricin (100 µg.ml−1) for double selection of hybrid individuals. After 24 hr, plates were replicated again on the same media to eliminate potential residual non-hybrid cells. In total, we generated 3025 hybrids, representing 2970 heterozygous hybrids with a unique parental combination and 55 homozygous hybrids.
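The cross counts above follow directly from pairing every MATa parent with every MATα parent. The following Python sketch is illustrative only: the strain names are placeholders (not the isolates listed in Supplementary file 1) and the function name is hypothetical.

```python
from itertools import product


def diallel_crosses(parents):
    """Pair every MATa parent with every MATalpha parent of the same panel."""
    crosses = list(product(parents, repeat=2))  # (MATa parent, MATalpha parent)
    homozygous = [(a, alpha) for a, alpha in crosses if a == alpha]
    heterozygous = [(a, alpha) for a, alpha in crosses if a != alpha]
    return crosses, homozygous, heterozygous


# Placeholder names standing in for the 55 parental backgrounds.
parents = [f"strain_{k:02d}" for k in range(1, 56)]
crosses, homozygous, heterozygous = diallel_crosses(parents)
print(len(crosses), len(heterozygous), len(homozygous))  # 3025 2970 55
```

The 2970 heterozygous crosses are 55 × 54 ordered MATa/MATα pairs, that is, 1485 unordered parental combinations, each present in both reciprocal directions.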

High-throughput phenotyping and growth quantification

Quantitative phenotyping was performed using endpoint colony growth on solid media. Strains were pregrown in liquid YPD medium and pinned onto a solid SC (Yeast Nitrogen Base with ammonium sulfate 6.7 g.l−1, amino acid mixture 2 g.l−1, agar 20 g.l−1, glucose 20 g.l−1) matrix plate to a 1536 density format using the replicating ROTOR robot (Singer Instruments). Two biological replicates (coming from independent cultures) of each parental haploid strain were present on every plate and six biological replicates were present for each hybrid. As 27 plates were used to phenotype all the hybrids, 27 technical replicates (same culture in different plates) of the parents were present. The resulting matrix plates were incubated overnight to allow sufficient growth and were then replicated onto 49 media conditions, plus SC as a pinning control (Figure 1—figure supplement 1, Supplementary file 2). The selected conditions impact a broad range of cellular responses, and multiple concentrations were tested for each compound (Figure 1—figure supplement 2). Most tested conditions displayed distinctive phenotypic patterns, suggesting a different genetic basis for each of them (Figure 1—figure supplement 2). The plates were incubated for 24 hr at 30°C (except for 14°C phenotyping) and were scanned at a resolution of 600 dpi in 16-bit grayscale. Quantification of the colony size was performed using the R package Gitter (Wagih and Parts, 2014) and the fitness of each strain on the corresponding condition was measured by calculating the normalized growth ratio between the colony size on a condition and the colony size on SC. As each hybrid is present in six replicates, the value considered for its phenotype is the median of all its replicates, thus smoothing the effects of pinning defects or contamination. This phenotyping step led to the determination of 148,225 hybrid/trait combinations (Figure 1—source data 1).
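To make the fitness measure concrete, the following Python sketch computes a growth ratio per replicate and takes the median over the six replicates of a hybrid. It is an illustration rather than the pipeline used here: the function names and colony sizes are hypothetical, and computing the ratio replicate by replicate before taking the median is an assumption.

```python
from statistics import median


def growth_ratio(size_on_condition, size_on_sc):
    """Colony size on the tested condition relative to the SC control."""
    return size_on_condition / size_on_sc


def hybrid_fitness(condition_sizes, sc_sizes):
    """Median growth ratio over the replicates of one hybrid."""
    ratios = [growth_ratio(c, s) for c, s in zip(condition_sizes, sc_sizes)]
    return median(ratios)


# Six hypothetical replicates; the fourth mimics a pinning defect.
condition_sizes = [80, 82, 78, 5, 81, 79]
sc_sizes = [100, 100, 100, 100, 100, 100]
print(hybrid_fitness(condition_sizes, sc_sizes))
```

In this toy example the failed replicate barely moves the median, whereas it would pull a mean down substantially.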

Diallel combining abilities and heritabilities

Combining ability values were calculated using a half diallel with unique parental combinations, excluding homozygous hybrids from identical parental strains. For each hybrid individual, the fitness value is expressed using Griffing’s model (Griffing, 1956):

$$z_{ij} = \mu + g_i + g_j + s_{ij} + e$$

Where $z_{ij}$ is the fitness value of the hybrid resulting from the combination of the ith and jth parental strains, $\mu$ is the mean population fitness, $g_i$ and $g_j$ are the general combining abilities (GCA) of the ith and jth parental strains, $s_{ij}$ is the specific combining ability (SCA) associated with the $i \times j$ hybrid, and $e$ is the error term (i = 1…N, j = 1…N, N = 55). General combining ability for the ith parent is calculated as:

$$\hat{g_i} = \frac{N-1}{N-2} \times \left(\bar{z_i} - \mu\right)$$

Where N is the total number of parental types, $\bar{z_i}$ is the mean fitness value of all half-sibling hybrids involving the ith parent, and $\mu$ is the population mean. The error term associated with $g_i$ is:

$$e_{g_i} = \sqrt{\frac{(N-1) \times \sigma^2_{z_{ij}}}{n \times N \times (N-2)}}$$

Where N is the total number of parental types, n is the number of replicates for the $i \times j$ hybrid, and $\sigma^2_{z_{ij}}$ is the variance of fitness values from a full-sib family involving the ith and jth parents, which is expressed as:

$$\sigma^2_{z_{ij}} = \sigma^2_{z_i} + \sigma^2_{z_j} + \sigma^2_{z_{ij}} + 2 \times \mathrm{cov}(z_i, z_j)$$

Specific combining ability for the $i \times j$ hybrid combination is therefore:

$$\hat{s_{ij}} = \bar{z_{ij}} - \hat{g_i} - \hat{g_j} - \mu$$

The error term associated with $\hat{s_{ij}}$ is:

$$e_{s_{ij}} = \sqrt{\frac{(N-3) \times \sigma^2_{z_{ij}}}{n \times (N-1)}}$$

Using combining ability estimates, broad- and narrow-sense heritabilities can be calculated. Narrow-sense heritability ($h^2$) accounts for the part of phenotypic variance explained only by additive variance, expressed as the additive variance ($\sigma_A^2$) over the total phenotypic variance observed ($\sigma_P^2$):

$$h^2 = \frac{\sigma_A^2}{\sigma_P^2} = \frac{\sigma^2_{(g_i+g_j)}}{\sigma^2_{(g_i+g_j)} + \sigma^2_{s_{ij}} + \sigma_e^2}$$

Where $\sigma^2_{(g_i+g_j)}$ is the sum of GCA variances, $\sigma^2_{s_{ij}}$ is the SCA variance and $\sigma_e^2$ is the variance due to measurement error, which is expressed as:

$$\sigma_e^2 = (N-2)\,\overline{\left(\overline{e_{g_i}} + \overline{e_{g_j}}\right)}^{2} + \frac{\frac{N^2-N}{2} - 1}{\frac{N^2-N}{2} + N - 3} \times \overline{e_{s_{ij}}}^{2}$$

On the other hand, broad-sense heritability ($H^2$) depicts the part of the phenotypic variance explained by the total genetic variance $\sigma_G^2$:

$$H^2 = \frac{\sigma_G^2}{\sigma_P^2} = \frac{\sigma^2_{(g_i+g_j)} + \sigma^2_{s_{ij}}}{\sigma^2_{(g_i+g_j)} + \sigma^2_{s_{ij}} + \sigma_e^2}$$

Phenotypic variance explained by non-additive variance is therefore equal to the difference between $H^2$ and $h^2$. All calculations were performed in R using custom scripts.
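As a rough illustration of the estimates described in this section, the Python sketch below computes the population mean, the general and specific combining abilities and the two heritabilities from a table of hybrid means. It is not the custom R code used in this study: the function names, the input layout (one mean per unordered parental pair) and the use of population variances are assumptions made for the example, and the error variance is passed in as a number rather than derived from the error terms.

```python
from itertools import combinations
from statistics import mean, pvariance


def combining_abilities(z):
    """Estimate mu, GCA and SCA from a half diallel.

    z maps frozenset({i, j}) to the mean fitness of the i x j hybrid,
    with one entry per unordered pair of distinct parents.
    """
    parents = sorted({p for pair in z for p in pair})
    n_parents = len(parents)
    mu = mean(z.values())
    gca = {}
    for i in parents:
        # Mean over all half-sibling hybrids that involve parent i.
        z_i = mean(v for pair, v in z.items() if i in pair)
        gca[i] = (n_parents - 1) / (n_parents - 2) * (z_i - mu)
    sca = {}
    for pair, value in z.items():
        i, j = tuple(pair)
        sca[pair] = value - gca[i] - gca[j] - mu
    return mu, gca, sca


def heritabilities(gca, sca, error_variance):
    """Return (h2, H2) from GCA and SCA estimates and an error variance."""
    additive = pvariance([sum(gca[p] for p in pair) for pair in sca])
    non_additive = pvariance(list(sca.values()))
    total = additive + non_additive + error_variance
    return additive / total, (additive + non_additive) / total


# Toy, purely additive panel of five parents (hypothetical values).
effects = {"A": -0.2, "B": -0.1, "C": 0.0, "D": 0.1, "E": 0.2}
z = {frozenset(p): 1.0 + effects[p[0]] + effects[p[1]]
     for p in combinations(effects, 2)}
mu, gca, sca = combining_abilities(z)
h2, H2 = heritabilities(gca, sca, error_variance=0.01)
print(round(mu, 3), {k: round(v, 3) for k, v in gca.items()})
print(round(h2, 3), round(H2, 3))
```

Because the toy panel is purely additive, the estimated GCA values recover the simulated parental effects, all SCA values are numerically zero, and the two heritabilities coincide.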

Calculation of mid-parent values and classification of mode of inheritance

Mid-Parent Value (MPV) is expressed as the mean fitness value of both diploid homozygous parental phenotypes:

$$MPV = \frac{P_1 + P_2}{2}$$

Comparing the hybrid phenotypic value (Hyb) to its respective parents’ allows for an inference of the mode of inheritance for each hybrid/trait combination (Figure 3a–b). To obtain a robust classification, confidence intervals for each class were based on the standard deviation of hybrid (six replicates) and parents (54 replicates). P2 is the phenotypic value of the fittest parent while P1 is the phenotypic value of the least fit parent.

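Inheritance mode | Formula
Underdominance | $Hyb < P_1 - (\sigma_{P_1} + \sigma_{Hyb})$
Dominance P1 | $P_1 - (\sigma_{P_1} + \sigma_{Hyb}) < Hyb < P_1 + (\sigma_{P_1} + \sigma_{Hyb})$
Partial dominance P1 | $P_1 + (\sigma_{P_1} + \sigma_{Hyb}) < Hyb < MPV - \left(\frac{\sigma_{P_1} + \sigma_{P_2}}{2} + \sigma_{Hyb}\right)$
Additivity | $MPV - \left(\frac{\sigma_{P_1} + \sigma_{P_2}}{2} + \sigma_{Hyb}\right) < Hyb < MPV + \left(\frac{\sigma_{P_1} + \sigma_{P_2}}{2} + \sigma_{Hyb}\right)$
Partial dominance P2 | $MPV + \left(\frac{\sigma_{P_1} + \sigma_{P_2}}{2} + \sigma_{Hyb}\right) < Hyb < P_2 - (\sigma_{P_2} + \sigma_{Hyb})$
Dominance P2 | $P_2 - (\sigma_{P_2} + \sigma_{Hyb}) < Hyb < P_2 + (\sigma_{P_2} + \sigma_{Hyb})$
Overdominance | $P_2 + (\sigma_{P_2} + \sigma_{Hyb}) < Hyb$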

When a clear separation is possible between the two parental phenotypic values ($P_1 + \sigma_{P_1} < P_2 - \sigma_{P_2}$), the full decomposition into the seven above-mentioned categories is possible (Figure 3a). However, in most cases, the two parental phenotypic values are not separated enough to achieve this, but it is still possible to distinguish between overdominance and underdominance (Figure 3b, Figure 3d). All calculations were performed in R using custom scripts.
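The classification rules can be sketched in Python as follows. This is an illustration under stated assumptions, not the R implementation used here: the function name and the "unresolved" label are hypothetical, values falling exactly on a boundary are assigned to a neighbouring class, and when intervals overlap the first matching class wins.

```python
def classify_inheritance(hyb, sd_hyb, parent_a, sd_a, parent_b, sd_b):
    """Assign a mode of inheritance to one hybrid/trait combination."""
    # Order the parents so that P1 is the least fit and P2 the fittest.
    (p1, sd_p1), (p2, sd_p2) = sorted([(parent_a, sd_a), (parent_b, sd_b)])
    mpv = (p1 + p2) / 2
    ci_p1 = sd_p1 + sd_hyb
    ci_p2 = sd_p2 + sd_hyb
    ci_mpv = (sd_p1 + sd_p2) / 2 + sd_hyb

    # Over- and underdominance can be called even for close parents.
    if hyb < p1 - ci_p1:
        return "underdominance"
    if hyb > p2 + ci_p2:
        return "overdominance"
    # The remaining classes require clearly separated parents.
    if not p1 + sd_p1 < p2 - sd_p2:
        return "unresolved"
    if hyb < p1 + ci_p1:
        return "dominance P1"
    if hyb < mpv - ci_mpv:
        return "partial dominance P1"
    if hyb <= mpv + ci_mpv:
        return "additivity"
    if hyb < p2 - ci_p2:
        return "partial dominance P2"
    return "dominance P2"


# Hypothetical parents at 1.0 and 2.0, all standard deviations 0.05.
for hybrid_value in (0.8, 1.0, 1.25, 1.5, 1.75, 2.0, 2.2):
    print(hybrid_value, classify_inheritance(hybrid_value, 0.05, 1.0, 0.05, 2.0, 0.05))
```

With well-separated parents, the seven hypothetical hybrid values above fall one into each of the seven classes, from underdominance to overdominance.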

Genome-wide association studies on the diallel panel

Whole genome sequences for the parental strains were obtained from the 1002 yeast genome project (Peter et al., 2018). Sequencing was performed on an Illumina HiSeq 2000 with a read length of 102 bases. Reads were then mapped to the S288c reference genome using bwa (v0.7.4-r385) (Li and Durbin, 2009). Local realignment around indels and variant calling were performed with GATK (v3.3–0) (McKenna et al., 2010). The genotypes of the F1 hybrids were constructed in silico using 34 parental genome sequences. We retained only the biallelic polymorphic sites, resulting in a matrix containing 295,346 polymorphic sites encoded using the ‘recode12’ function in PLINK (Chang et al., 2015). Those genotypes correspond to a half-matrix of pairwise crosses with unique parental combinations, including the diagonal, that is, the 34 homozygous parental genotypes. For each cross, we combined the genotypes of both parents to generate the hybrid diploid genome. As a result, heterozygous sites correspond to sites for which the two parents had different allelic versions. We removed sites in long-range linkage disequilibrium in the diallel matrix, which result from the low number of founder parental genotypes, by removing haplotype blocks that are shared more than twice across the population, resulting in a final dataset containing 31,632 polymorphic sites.

We performed GWA analyses with different encodings (Seymour et al., 2016). In the additive model, the genotypes of the F1 progeny were simply the concatenation of the genotypes from the parents. As homozygous parental alleles were encoded as 1 or 2, the possible alleles for each site in the F1 genotype were ‘11’ and ‘22’ for homozygous sites and ‘12’ for heterozygous sites. We also used an overdominant genotype encoding, where both the homozygous minor and homozygous major alleles were encoded as ‘11’ and the heterozygous genotype was encoded as ‘22’.
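The two encodings can be illustrated with a short Python sketch. The function names and the four-site genotypes are hypothetical, and the code only mirrors the description above; it does not reproduce the PLINK files used for the analysis.

```python
def hybrid_genotype(parent_a, parent_b):
    """Combine two homozygous parental genotypes (alleles coded 1 or 2)."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must be genotyped at the same sites")
    # Sorting each pair makes heterozygous sites read '12' whatever the parent order.
    return ["".join(str(allele) for allele in sorted(pair))
            for pair in zip(parent_a, parent_b)]


def overdominant_encoding(genotype):
    """Recode so that both homozygotes are '11' and heterozygotes are '22'."""
    return ["22" if site == "12" else "11" for site in genotype]


# Two hypothetical parents genotyped at four biallelic sites.
additive = hybrid_genotype([1, 2, 2, 1], [1, 1, 2, 2])
print(additive)                         # ['11', '12', '22', '12']
print(overdominant_encoding(additive))  # ['11', '22', '11', '22']
```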

Mixed-model association analysis was performed using the FaST-LMM Python library version 0.2.32 (https://github.com/MicrosoftGenomics/FaST-LMM) (Widmer et al., 2015). We normalized the phenotypes by replacing each observed value with the corresponding quantile from a standard normal distribution, as FaST-LMM expects normally distributed phenotypes. The command used for association testing was the following: single_snp(bedFiles, pheno_fn, count_A1 = True), where bedFiles is the path to the PLINK formatted SNP data and pheno_fn is the PLINK formatted phenotype file. By default, for each SNP tested, this method excludes the chromosome in which the SNP is found from the analysis in order to avoid proximal contamination. FaST-LMM also computes the fraction of heritability explained for each SNP. The mixed model adds a polygenic term to the standard linear regression, designed to circumvent the effects of relatedness and population stratification.
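The phenotype normalization step can be sketched in Python as a rank-based inverse normal transform. The exact procedure is not specified here, so the rank offset of 0.5, the handling of ties (broken by input order) and the function name are assumptions made for illustration.

```python
from statistics import NormalDist


def inverse_normal_transform(values):
    """Replace each value by the standard normal quantile of its rank."""
    n = len(values)
    order = sorted(range(n), key=lambda k: values[k])
    standard_normal = NormalDist()
    transformed = [0.0] * n
    for rank, k in enumerate(order, start=1):
        # (rank - 0.5) / n keeps probabilities strictly between 0 and 1.
        transformed[k] = standard_normal.inv_cdf((rank - 0.5) / n)
    return transformed


# Hypothetical growth ratios for five hybrids.
print(inverse_normal_transform([0.91, 0.12, 0.55, 0.47, 0.30]))
```

The transform preserves the ordering of the phenotypes while forcing their distribution to be symmetric around zero.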

We estimated a condition-specific p-value threshold for each condition by permuting phenotypic values between individuals 100 times. The significance threshold was the 5% quantile (the 5th lowest p-value from the permutations). With this method, variants passing the threshold have a 5% family-wise error rate. However, we do not have any estimate of the false positive rate. Taken together, GWA revealed 1723 significantly associated SNPs (Figure 4—source data 1), with 1273 and 450 SNPs for the overdominant and additive models, respectively.
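A permutation threshold of this kind can be sketched in Python as follows. The sketch assumes that the smallest p-value of each permuted scan is recorded, which is the usual way to control the family-wise error rate but is not stated explicitly here. The function name is hypothetical, and association_pvalues stands for any caller-supplied scan (for instance a wrapper around the association test) returning one p-value per SNP.

```python
import random


def permutation_threshold(phenotypes, association_pvalues, n_permutations=100,
                          alpha=0.05, seed=0):
    """Return the p-value threshold for a family-wise error rate of alpha.

    association_pvalues is a caller-supplied function (hypothetical here) that
    takes a list of phenotypes and returns one p-value per tested SNP.
    """
    rng = random.Random(seed)
    minima = []
    for _ in range(n_permutations):
        shuffled = list(phenotypes)
        rng.shuffle(shuffled)  # break the genotype-phenotype link
        minima.append(min(association_pvalues(shuffled)))
    minima.sort()
    k = max(1, round(alpha * n_permutations))  # 5th lowest for 100 permutations
    return minima[k - 1]
```

Because each permuted scan contributes only its best p-value, about 5% of scans run on phenotypes with no true association would produce at least one SNP below the returned threshold.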

Variance explained and effect size

Variance explained by each SNP is calculated by PLINK. Note that the variance explained by all SNPs cannot be obtained by summing the variance explained by each individual SNP, because SNPs are not completely independent of one another.

The effect size was calculated using the formula for Cohen's d:

$$d = \frac{\bar{x}_1 - \bar{x}_2}{sd_{Pooled}}$$

Where the pooled standard deviation is calculated with the following formula:

$$sd_{Pooled} = \sqrt{\frac{sd_1^2 + sd_2^2}{2}}$$

Under the additive model, the heterozygote phenotype is equidistant from both possible homozygote phenotypes (minor allele and major allele), so our calculation of the effect size could compare the heterozygotes either with the homozygotes for the minor allele or with the homozygotes for the major allele. We chose the latter since the major allele grants us more statistical power. The formula we used to obtain the effect size for a given SNP under this model is the following:

$$\text{Effect size} = \frac{\bar{x}_{Heterozygous} - \bar{x}_{Major}}{sd_{Pooled}}$$

Under the overdominant model, the heterozygote phenotype is compared to the phenotype of the group of both homozygotes (minor and major), so the formula we used to obtain the effect size for a given SNP under this model is the following:

$$\text{Effect size} = \frac{\bar{x}_{Heterozygous} - \bar{x}_{Homozygous}}{sd_{Pooled}}$$
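As a worked example of the effect size calculation, the Python sketch below applies Cohen's d with the pooled standard deviation to two small groups of phenotypes. The values and function names are hypothetical, and the use of sample standard deviations is an assumption.

```python
from math import sqrt
from statistics import mean, stdev


def pooled_sd(group_1, group_2):
    """Square root of the average of the two group variances."""
    return sqrt((stdev(group_1) ** 2 + stdev(group_2) ** 2) / 2)


def effect_size(group_1, group_2):
    """Cohen's d: difference between group means in pooled-SD units."""
    return (mean(group_1) - mean(group_2)) / pooled_sd(group_1, group_2)


# Hypothetical phenotypes for one SNP under the additive model.
heterozygous = [2.0, 4.0, 6.0]
homozygous_major = [1.0, 3.0, 5.0]
print(effect_size(heterozygous, homozygous_major))  # 0.5
```

Under the overdominant model, the second group would instead contain the phenotypes of all homozygous hybrids, minor and major alike.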

Gene ontology analysis

GO term enrichment was performed using SGD GO Term Finder (https://www.yeastgenome.org/goTermFinder) with the 546 unique genes containing significantly associated SNPs (Figure 4—source data 1 and Supplementary file 3). Significant enrichment is considered under ‘Process’ ontology with a p-value cutoff of 0.05.

CRISPR-Cas9 allele editing

The pAEF5 plasmid containing the Cas9 endonuclease and the guide RNA targeting SGD1 was co-transformed with a repair fragment of 100 nucleotides containing the desired allele. Transformed cells were then plated on YPD supplemented with 200 µg.ml−1 hygromycin at 30°C to select for transformants. Colonies were then arrayed on a 96-well plate with 100 µl YPD and grown for 24 hr to induce plasmid loss. The plate was then pinned back onto solid YPD for 24 hr, then replica plated to YPD supplemented with 200 µg.ml−1 hygromycin to check for plasmid loss. Allele-specific PCR was performed on colonies that had lost the plasmid (Wangkumhang et al., 2007) to distinguish the correctly edited allele from the wild-type allele. Strains that showed amplification for the edited allele and no amplification for the wild-type allele were phenotyped (four technical replicates and four biological replicates) on the corresponding condition to measure differences with their wild-type counterparts.

Statistical tests

Pearson’s correlation test was used to assess the linear correlation between two sets of values.

The Wilcoxon-Mann-Whitney test was used to determine whether two independent samples have the same distribution.

Correlogram of all tested growth conditions. Numbers in each cell represent 100 x Pearson’s r value.

References

  24. Rare and low-frequency coding variants alter human adult height
    E Marouli, M Graff, C Medina-Gomez, KS Lo, AR Wood, et al.
    (2017)
    Nature 542:186–190.
    https://doi.org/10.1038/nature21039
  42. Defining the role of common variation in the genomic and biological architecture of adult human height
    AR Wood, T Esko, J Yang, S Vedantam, TH Pers, et al.
    (2014)
    Nature Genetics 46:1173–1186.
    https://doi.org/10.1038/ng.3097

Decision letter

  1. Christian R Landry
    Reviewing Editor; Université Laval, Canada
  2. Naama Barkai
    Senior Editor; Weizmann Institute of Science, Israel

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The authors examine the relationship between the frequency of genetic variants in natural populations and their effects on complex growth traits using the budding yeast as a model. They find that high-impact variants tend to be rare and that their effects often combine in a non-additive manner. Their results contribute to a better understanding of phenotypic diversity and will help future developments in the use of natural populations for the mapping of genetic variation underlying complex traits such as those using GWAS in which low-frequency variants represent a particular challenge. Their observations are therefore of interest to a large community of scientists interested in evolution, genetics and particularly in the architecture of complex traits. The data produced and approach developed also represent an important resource for the community.

Decision letter after peer review:

Thank you for submitting your article "Extensive impact of low-frequency variants on the phenotypic landscape at population-scale" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Naama Barkai as the Senior Editor. The reviewers have opted to remain anonymous.

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

Your paper examines the correlation between allele frequencies and their effects on quantitative characters using QTL mapping and the analysis of a large number of genomes. You find that rare variants explain an unexpectedly large proportion of phenotypic variance. Your study is one of the first to examine this association systematically. Overall, the reviewers found the work of interest and to be a potentially important contribution. One major concern that emerged from the reviews and the discussions among the reviewers is that the importance of the work will not be obvious for non-specialists. One reviewer also mentions that similar conclusions could have been obtained from a meta-analysis of the existing literature. Since eLife is a generalist journal, it would be crucial to better articulate why the study is important and how the findings will impact the field of genetics and maybe evolution in general. More theoretical background as to why variants with large impacts on phenotypes should be rare or vice-versa would be useful. The manuscript is currently very short so you have plenty of space to extend on these points in the Introduction and in the Discussion. One reviewer also suggested you extend the analysis and text on the implication of the conditions tested for yeast biology, which I believe would strengthen the paper as well in terms of impact.

I collated below the other comments of the reviewers that are essential points to consider if you want to submit a revised version.

Essential revisions:

1) For a polygenic trait, the distinction between dominance and additivity isn't a relevant one. For example, you could have 100 loci, each completely dominant, but if they are additive between loci, the hybrid test will appear additive. The later GWAS results suggest that many variants have an over-dominant effect (or at least some over-dominant component). I can see what the authors are trying to do here, i.e., to assess the contribution of additivity versus other non-additive effects, but I think as long as there are many loci and there is some degree of additivity between loci, everything will appear additive. I think the distinction between additive and non-additive effects is only relevant when discussing one locus. If you had a panel of near-isogenic lines, a diallel experiment could answer the question of additivity versus non-additivity. The results from this analysis are still useful and I would suggest the authors simply report the results without invoking the terms additivity and dominance. Alternatively, clearly state the caveats so readers don't misread the interpretation.

2) I have a somewhat different interpretation of the rare versus common comparison. There are a few facts nicely presented. 1) although there are fewer rare variants in the diallel than common ones, rare variants are more likely to be associated with the traits. This is a major finding. 2) On a per variant basis, common and low-frequency variants explain about the same amount of variation. This means the effect size should be larger for rare variants than common variants. I don't think the statistical significance in Figure 4D is worth highlighting, the difference was minimal (20.2% versus 19.6% with a large variance). Power is proportional to variance explained so it's expected that these two groups produce more or less equal variance on a per variant basis if using the same threshold. However, in the diallel, there are way more common variants than rare variants. This means in the diallel, more variance is explained by common variants as a whole. I can see that if rare variants are more likely to be associated with traits, then in an outbred population, they could also be disproportionally associated with traits but more difficult to detect. I would appreciate some discussion on the contribution by a per-variant basis and overall contribution.

3) The main conclusion of the manuscript is that rare variants significantly contribute to genetic variance. In my view, this conclusion is biased as these rare causal variants are being analyzed in genetic backgrounds in which they are no longer rare; actually, these variants are biallelic. Several studies have shown that a rare variant of MKT1(89A) is a significant contributor to phenotypic variation whenever it is present in segregating populations. However, the MKT1(89A) allele is hardly identified when one of the parents is not S288c, the strain which harbours this allele. So the extension that if a rare variant has a significant effect in a sub-population, its effect size would be similar in a large heterogeneous population is false. Furthermore, the authors conclude that in their larger 55-strain population, a representative sample of the 1000-strain collection, most of the variants have additive effects. The authors claim that this revalidates previous studies (Bloom et al., 2013, 2015), which found that most of the causal variants between BYxRM had additive effects. However, subsequent papers (Forsberg et al. 2017, PMID 28250458; Yadav et al. 2016) showed that variance mapping in BYxRM segregants helped to account for genetic interactions and showed how non-additive interactions also contribute significantly to phenotypic variation. One of the results in the manuscript, that non-additive effects contribute one third of the phenotypic variance, indicates that additive effects do not explain all effects, with dominance, a non-additive interaction, being a significant contributor. Also, the authors fail to explain why dominance is so frequently observed in their diallel panel. A possible reason could be that one variant is selected for a trait better than the other, and in combination with a weaker or neutral allele, it shows dominance.

4) I find that just doing a few more strains does not make this manuscript a significant advance over the previous studies. One can argue that by taking into account all causal variants identified to date (Fay, 2013), one can determine what proportion of rare variants has been identified, a typical example being the MKT1(89A) allele as causal, even though their effect size will not be identified using this strategy. Peltier et al., 2019, show that 284 rare QTN variants have been identified to date, these functional variants being private to a subpopulation, possibly due to their adaptive role in a specific environment. Moreover, this conclusion can be made without these extensive experimental crosses.

https://doi.org/10.7554/eLife.49258.020

Author response

Your paper examines the correlation between allele frequencies and their effects on quantitative characters using QTL mapping and the analysis of a large number of genomes. You find that rare variants explain an unexpectedly large proportion of phenotypic variance. Your study is one of the first to examine this association systematically. Overall, the reviewers found the work of interest and to be a potentially important contribution. One major concern that emerged from the reviews and the discussions among the reviewers is that the importance of the work will not be obvious for non-specialists. One reviewer also mentions that similar conclusions could have been obtained from a meta-analysis of the existing literature.

We performed such an analysis in the framework of the 1002 Yeast Genomes Project and this analysis was mentioned in the first version of the manuscript. More recently, we were involved in a larger analysis (Peltier et al., 2019), which was not cited because it was unpublished at that time. A proper citation has now been included and we comment on this specific point in the Discussion.

Even if such analyses are insightful, we think that the subset of QTNs detected in yeast by linkage mapping is biased for two reasons. First, in terms of the genetic backgrounds studied, most linkage mapping studies were performed on largely the same set of isolates. Second, experimentally validated QTNs are often prioritized based on their effect size.

Our study allows for a more global and quantitative approach as the variants are taken from a representative, genetically diverse and larger population. The subset of genetic variants is also much larger. Overall, this dataset gives a precise and quantitative global view of the role of low-frequency variants in the phenotypic diversity of a population.

Since eLife is a generalist journal, it would be crucial to better articulate why the study is important and how the findings will impact the field of genetics and maybe evolution in general. More theoretical background as to why variants with large impacts on phenotypes should be rare or vice-versa would be useful. The manuscript is currently very short so you have plenty of space to extend on these points in the Introduction and in the Discussion.

As suggested, we modified the Introduction by adding more background on the missing heritability problem as well as on the role of low-frequency and rare variants in human diseases. We also expanded the Discussion to address several points raised during the reviewing process (see below).

One reviewer also suggested you extend the analysis and text on the implication of the conditions tested for yeast biology, which I believe would strengthen the paper as well in terms of impact.

The goal of our study was to examine a wide range of complex traits. Consequently, we selected a large number of conditions for which the phenotypic variance was broad in our population. These conditions were already tested in the framework of the 1002 Yeast Genomes Project (Peter et al., 2018). Most of them show a normal distribution, meaning that they correspond to complex traits. A good dissection and analysis of the implications of the tested conditions for yeast biology requires an additional step, namely the determination of inheritance patterns in the progeny. This is intended as a logical follow-up to this study.

Essential revisions:

1) For a polygenic trait, the distinction between dominance and additivity isn't a relevant one. For example, you could have 100 loci, each completely dominant, but if they are additive between loci, the hybrid test will appear additive. The later GWAS results suggest that many variants have an over-dominant effect (or at least some over-dominant component). I can see what the authors are trying to do here, i.e., to assess the contribution of additivity versus other non-additive effects, but I think as long as there are many loci and there is some degree of additivity between loci, everything will appear additive. I think the distinction between additive and non-additive effects is only relevant when discussing one locus. If you had a panel of near-isogenic lines, a diallel experiment could answer the question of additivity versus non-additivity. The results from this analysis are still useful and I would suggest the authors simply report the results without invoking the terms additivity and dominance. Alternatively, clearly state the caveats so readers don't misread the interpretation.

As we only look at the final phenotype of the hybrid, we agree that the observed additivity or dominance reflects the combined effects of all the genes involved and that no distinction between the effects of individual loci can be made. However, one can argue that if dominance is detected as the main mode of inheritance, it might suggest the presence of a locus of high phenotypic impact acting dominantly. Also, if two hybrids display complete dominance towards a parent, this does not necessarily mean that the same locus is involved in both cases. As suggested, we clearly stated the caveats and added a paragraph to the Discussion to clarify this point.
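To make the whole-phenotype comparison concrete, the following is a minimal sketch of how a hybrid can be placed relative to its two parents using the classical degree-of-dominance ratio d/a. It is an illustration only and not necessarily the regression procedure used in the study; the function names, the tolerance and the phenotype values are hypothetical.

```python
def dominance_ratio(parent1, parent2, hybrid):
    """Return d/a: the hybrid's deviation from the mid-parent value (d),
    scaled by half the difference between the parents (a)."""
    mid_parent = (parent1 + parent2) / 2
    half_range = abs(parent1 - parent2) / 2
    if half_range == 0:
        raise ValueError("parents have identical phenotypes; ratio undefined")
    return (hybrid - mid_parent) / half_range


def classify_inheritance(ratio, tol=0.25):
    """Label a hybrid from its d/a ratio; `tol` is an arbitrary tolerance."""
    magnitude = abs(ratio)
    if magnitude <= tol:
        return "additive"
    if abs(magnitude - 1) <= tol:
        return "complete dominance"
    if magnitude > 1 + tol:
        return "overdominance"
    return "partial dominance"


# Hypothetical growth phenotypes for two parents and four possible hybrids.
p1, p2 = 1.0, 0.5
for hybrid_value in (0.75, 0.9, 1.0, 1.3):
    ratio = dominance_ratio(p1, p2, hybrid_value)
    print(hybrid_value, round(ratio, 2), classify_inheritance(ratio))
```

As noted above, such a label describes the net outcome across all loci that differ between the two parents; it says nothing about the mode of action of any single locus.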

2) I have a somewhat different interpretation of the rare versus common comparison. There are a few facts nicely presented.

1) although there are fewer rare variants in the diallel than common ones, rare variants are more likely to be associated with the traits. This is a major finding.

We thank the reviewer for this comment. It is indeed true that low-frequency variants are disproportionately associated with the traits (i.e. they are overrepresented) and we now place more emphasis on that point in the Abstract and the Results section.

2) On a per variant basis, common and low frequency variants explain about the same amount of variation. This means the effect size should be larger for rare variants than common variants. I don't think the statistical significance in Figure 4D is worth highlighting, the difference was minimal (20.2% versus 19.6% with a large variance). Power is proportional to variance explained so it's expected that these two groups produce more or less equal variance on a per variant basis if using the same threshold. However, in the diallel, there are way more common variants than rare variants. This means in the diallel, more variance is explained by common variants as a whole. I can see that if rare variants are more likely to be associated with traits, then in an outbred population, they could also be disproportionally associated with traits but more difficult to detect. I would appreciate some discussion on the contribution by a per-variant basis and overall contribution.

We thank the reviewer for these comments. This is only true if we look at it in the same population. However, in our diallel panel, the variants that have a low frequency in the initial population are no longer rare because of a shift in allele frequency. For example, a variant with a MAF of 3% in the collection of 1,011 isolates can rise to 25% in the diallel. Thus, the fraction of variance explained in the diallel won’t be linked to the MAF in the initial population.
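The frequency shift can be illustrated with a short sketch. It assumes that every parent is a homozygous diploid and that the panel contains all ordered pairwise crosses, including selfed and reciprocal ones (55 x 55 = 3025 hybrids); under these assumptions the allele frequency in the diallel equals the fraction of parents carrying the variant. The function name and the number of carrier parents are hypothetical.

```python
from itertools import product


def diallel_allele_frequency(parent_carries_variant):
    """Frequency of a variant allele across a full diallel panel.

    Assumes homozygous diploid parents and that all ordered pairwise
    crosses (selfed and reciprocal crosses included) are present.
    """
    n_parents = len(parent_carries_variant)
    variant_copies = 0
    total_alleles = 0
    for i, j in product(range(n_parents), repeat=2):
        # Each hybrid inherits one allele from each of its two parents.
        variant_copies += int(parent_carries_variant[i])
        variant_copies += int(parent_carries_variant[j])
        total_alleles += 2
    return variant_copies / total_alleles


n_parents = 55   # number of parental isolates in the diallel
n_carriers = 14  # hypothetical number of parents carrying the variant
carriers = [True] * n_carriers + [False] * (n_parents - n_carriers)
freq = diallel_allele_frequency(carriers)
print(f"allele frequency in the diallel: {freq:.3f}")
```

With these illustrative numbers, a variant carried by about a quarter of the parents reaches a frequency of about 25% in the panel, whatever its frequency in the wider collection.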

To address this issue, we computed the effect size of the significantly associated variants. Effect size is a metric that is independent of allele frequency, which makes it more amenable to extrapolation to a different population. We added a paragraph on this point and a figure (Figure 3E) to the Results section, and we also address it in the Discussion.
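The contrast between effect size and variance explained can be sketched with the standard single-locus expression for additive variance, 2p(1 - p)β², where p is the allele frequency and β the per-allele effect. This is a textbook approximation that assumes Hardy-Weinberg genotype proportions and a purely additive locus; it is not the computation performed in the study, and the effect size below is hypothetical.

```python
def variance_explained(effect_size, allele_freq):
    """Additive variance 2p(1-p)β² contributed by one biallelic locus,
    assuming Hardy-Weinberg proportions and a purely additive effect."""
    return 2 * allele_freq * (1 - allele_freq) * effect_size ** 2


beta = 1.0  # hypothetical per-allele effect, in phenotype units
v_low = variance_explained(beta, 0.03)    # MAF in the initial population
v_high = variance_explained(beta, 0.25)   # MAF after the shift in the diallel
print(f"variance at MAF 3%:  {v_low:.4f}")
print(f"variance at MAF 25%: {v_high:.4f}")
print(f"ratio: {v_high / v_low:.1f}")
```

The effect size is the same in both cases, while the variance it accounts for changes several-fold with allele frequency, which is why effect size is the more portable of the two metrics.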

Concerning the fraction explained by common and low-frequency associated SNPs, we agree that the difference is minimal. As suggested, we no longer highlight that point in the new version.

3) The main conclusion of the manuscript is that rare variants significantly contribute to genetic variance. In my view, this conclusion is biased as these rare causal variants are being analyzed in genetic backgrounds in which they are no longer rare; actually, these variants are biallelic. Several studies have shown that a rare variant of MKT1(89A) is a significant contributor to phenotypic variation whenever it is present in segregating populations. However, the MKT1(89A) allele is hardly identified when one of the parents is not S288c, the strain which harbours this allele. So the extension that if a rare variant has a significant effect in a sub-population, its effect size would be similar in a large heterogeneous population is false.

This part is related to what we mentioned previously. Indeed, the effect size of this variant would be roughly the same in a different population; however, it is true that the fraction of variance explained by such a variant could be different. Consequently, we computed the effect size of the significantly associated variants and we’ve shown that the effect size of low-frequency variants is not much different from that of common variants.

Furthermore, the authors conclude that in their larger 55-strain population, a representative sample of the 1000-strain collection, most of the variants have additive effects. The authors claim that this revalidates previous studies (Bloom et al., 2013, 2015), which found that most of the causal variants between BYxRM had additive effects. However, subsequent papers (Forsberg et al. 2017, PMID 28250458; Yadav et al. 2016) showed that variance mapping in BYxRM segregants helped to account for genetic interactions and showed how non-additive interactions also contribute significantly to phenotypic variation. One of the results in the manuscript, that non-additive effects contribute one third of the phenotypic variance, indicates that additive effects do not explain all effects, with dominance, a non-additive interaction, being a significant contributor. Also, the authors fail to explain why dominance is so frequently observed in their diallel panel. A possible reason could be that one variant is selected for a trait better than the other, and in combination with a weaker or neutral allele, it shows dominance.

As suggested, we added the references in the text. One hypothesis that could explain the importance of dominance in our dataset is the presence, in some strains, of genetic variants with a strong phenotypic effect that act dominantly and are responsible for most of the phenotypic variance in all crosses that are heterozygous at this particular locus. We have now added this point to the Discussion section.

4) I find that just doing a few more strains does not make this manuscript a significant advance over the previous studies. One can argue that by taking into account all causal variants identified to date (Fay, 2013), one can determine what proportion of rare variants has been identified, a typical example being the MKT1(89A) allele as causal, even though their effect size will not be identified using this strategy. Peltier et al., 2019, show that 284 rare QTN variants have been identified to date, these functional variants being private to a subpopulation, possibly due to their adaptive role in a specific environment. Moreover, this conclusion can be made without these extensive experimental crosses.

As already discussed above, we strongly believe that our study provides a more global and systematic approach than the concatenation of results from different linkage mapping studies. We exhaustively examined and compared the fraction of variance explained and the effect size for a large dataset of associated genetic variants, which were not chosen based on their effect size.

https://doi.org/10.7554/eLife.49258.021

Article and author information

Author details

  1. Téo Fournier

    Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
    Contribution
    Conceptualization, Resources, Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4860-6728
  2. Omar Abou Saada

    Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
    Contribution
    Software, Formal analysis, Writing—review and editing
    Competing interests
    No competing interests declared
  3. Jing Hou

    Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
    Contribution
    Conceptualization, Software, Formal analysis, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Jackson Peter

    Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
    Contribution
    Software, Formal analysis
    Competing interests
    No competing interests declared
  5. Elodie Caudal

    Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
    Contribution
    Resources, Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  6. Joseph Schacherer

    Université de Strasbourg, CNRS, GMGM UMR 7156, Strasbourg, France
    Contribution
    Conceptualization, Supervision, Funding acquisition, Validation, Methodology, Writing—original draft, Project administration, Writing—review and editing
    For correspondence
    schacherer@unistra.fr
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-6606-6884

Funding

National Institutes of Health (R01 GM101091-01)

  • Joseph Schacherer

European Research Council (Consolidator grants (772505))

  • Joseph Schacherer

Fondation pour la Recherche Médicale (Graduate student grant)

  • Téo Fournier

Institut Universitaire de France

  • Joseph Schacherer

University of Strasbourg Institute for Advanced Study

  • Joseph Schacherer

Ministère de l’Enseignement Supérieur et de la Recherche

  • Téo Fournier

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Joshua Bloom and Leonid Kruglyak for insightful discussions, comments on the manuscript as well as for sharing their unpublished manuscript. We thank Maitreya Dunham and the members of the Schacherer laboratory for comments and suggestions. We also thank Gilles Fischer for providing the pAEF5 plasmid. This work was supported by a National Institutes of Health (NIH) grant R01 (GM101091-01) and a European Research Council (ERC) Consolidator grant (772505). TF is supported in part by a grant from the Ministère de l’Enseignement Supérieur et de la Recherche and in part by a fellowship from the medical association la Fondation pour la Recherche Médicale. JS is a Fellow of the University of Strasbourg Institute for Advanced Study (USIAS) and a member of the Institut Universitaire de France.

Senior Editor

  1. Naama Barkai, Weizmann Institute of Science, Israel

Reviewing Editor

  1. Christian R Landry, Université Laval, Canada

Publication history

  1. Received: June 12, 2019
  2. Accepted: October 23, 2019
  3. Accepted Manuscript published: October 24, 2019 (version 1)
  4. Version of Record published: December 4, 2019 (version 2)

Copyright

© 2019, Fournier et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Further reading

    1. Genetics and Genomics
    Luisa F Pallares
    Insight

    Rare genetic variants in yeast explain a large amount of phenotypic variation in a complex trait like growth.

    1. Genetics and Genomics
    2. Stem Cells and Regenerative Medicine
    Lucas D Sanor et al.
    Research Advance