Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: Jeffrey Ross-Ibarra, University of California, Davis, Davis, United States of America
- Senior Editor: George Perry, Pennsylvania State University, University Park, United States of America
Reviewer #1 (Public Review):
Summary:
In this study, the authors attempt to reinvestigate an old question in population genetics regarding the age of alleles that have experienced different strengths (and directions) of natural selection. Under simple population genetic models, alleles that are positively selected are expected to change frequency in populations faster than neutral alleles. So the naïve expectation is that if you look at alleles that are at the same population frequency, those that have been evolving neutrally should have been segregating in the population longer than those that have been experiencing natural selection. While this is exactly what the authors find for alleles inferred to be experiencing negative selection (i.e. they tend to be younger than alleles inferred to be neutral that are at the same frequency), the authors find the opposite for alleles inferred to be under positive selection: they tend to be older than alleles inferred to be neutral. The authors argue that this pattern can be explained by a model where positively selected mutations experience a phase of balancing selection that can dramatically extend the period of time that these alleles segregate in the population.
Strengths:
The question that the authors address is very interesting and thought-provoking. When confronted with a counter-intuitive finding, the authors describe an interesting hypothesis to explain it. The authors investigate a number of interesting sub-analyses to corroborate their findings.
Weaknesses:
While there are some intriguing hypotheses in this manuscript, I struggle to be convinced. The main point that the authors argue is that positively selected alleles are older than their neutral counterparts at the same frequency. They argue that this may be because the positively selected alleles are stuck in some form of balancing selection for a long time before they switch to a more classical form of directional selection. The form of balancing selection they argue is one caused by linkage to deleterious alleles, which takes time for the beneficial alleles to recombine onto a more neutral background. I would really like to see some simulations that demonstrate this can actually occur on average. Reading this paper brought back memories of the classic Birky and Walsh (1988; PMCID: PMC281982) paper that argued that linkage amongst selected alleles does not impact the substitution rate of linked neutral alleles, but does reduce the substitution rate among beneficial alleles. Their simple simulations in 1988 illuminated how this works, and they developed a simple mathematical model that helped us understand how it works. In the current paper, it seems the authors are arguing for a similar effect, but rather than focus on beneficial alleles that fix, they are focusing on beneficial alleles that are still segregating. These seem like similar stories, but without simulations or a mathematical model, I struggle to gain any insight into why the observation is the way it is (and not simply due to a number of possible confounding effects noted below).
There are a number of elements to the methods and interpretation that could use clarification.
• Genetic data. One of the biggest weaknesses of this analysis is the choice of genetic data. The authors use the UK10k dataset, and reference the 2015 paper. Looking at that paper, it seems that the data may be composed of low coverage whole genome sequencing data (7x) and high coverage exome sequence data (80x). It appears that these data were integrated into a single VCF file, similar to the 1000 Genomes Project Phase 3 data. If these are the data that were used, then there are substantial differences between the coding and non-coding variants that are compared. However, it is possible that the authors chose to restrict the analysis to the low coverage WGS data and neglected to indicate it in the methods section. I will assume that this is the case for the rest of the review, but the authors should clarify.
• Recombination rates. I believe the authors use an LD-based recombination map. While these maps are correlated at the longer physical distances with pedigree maps, there are substantial differences at shorter physical scales. These differences have been argued to be due to the action of natural selection skewing patterns of LD. If that is the case, then some of the observations in this paper are circular. Please confirm similar findings with a pedigree-based recombination map.
• Recombination rates, pt 2. The authors compare patterns of non-synonymous coding variants to a set of non-coding, non-regulatory SNPs. They argue "these will necessarily have experienced similar mutational and recombinational processes". I don't know that this is true. There are both distinct recombination patterns and mutational patterns in genes vs non-coding regions of the genome. It would be important to more carefully match coding and non-coding variants based on both recombination as well as the type of nucleotide change. There are substantial differences in CpG composition in coding vs non-coding regions for example. While the authors say "Analyses thought to be sensitive to CpG high mutability were limited to SNPs that did not occur as part of a CpG", it is quite unclear where CpGs were included vs excluded.
• Identifying ancestral vs derived alleles. It is unclear how the authors identified ancestral vs derived alleles (they say "inferred ancestral sequence from Ensembl (1) and a maximum likelihood estimator"). Several studies have shown that ancestral misidentification can cause skews in the site frequency spectrum. If the ancestral state of some fraction of alleles were misidentified, then the estimated allele age would be incorrect. Figure 1B shows that the mean frequency of the alleles with the largest delta-EP tends to be very low. This makes me think that ancestral misidentification may have impacted the results.
• Figure 2B and C. I do not understand how the median can be so far outside the mean and error bars. The legend does not specify what the error bars are, but I feel the distribution must be shown if it is so skewed that the mean and any definition of error do not include the median.
• Inferring allele ages. The authors use two methods for estimating allele ages, but focus on GEVA. They use the default parameter of effective population size 10,000. How sensitive is the model to this assumption? It has been shown that different regions of the genome (particularly coding vs neutral non-coding) experience different rates of deleterious mutations, and therefore different rates of background selection. Simple models of background selection would suggest that these regions will therefore have different effective population sizes.
• Fst analysis. The authors look at Fst among 3 populations as a function of delta-EP compared to frequency-matched control SNPs. They find there is no statistical support for different levels of Fst in any pairwise comparison for any delta-EP bin. It seems strange that alleles with large delta-EP would not show increased Fst compared to control SNPs... If they are indeed positively selected, the assumption must be that they are then positively selected in all populations, which seems unlikely. Alternatively, by considering only narrow allele frequency bins, it is possible that Fst is also being controlled, and therefore this analysis is non-informative. A simulation would help understand what the expected pattern is here.
• It would be great to show more figures like 2A. You can place the x-axis on a log-scale so that it is easier to view the lower allele frequencies. This plot clearly shows differences among the 3 categories. I am very surprised at the much shorter error bars for negative delta-EP at high frequency compared to positive delta-EP variants... Shouldn't there be very few negative delta-EP alleles at such high frequency?
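On the effective population size point raised above, a rough sense of the sensitivity can be had from the classical neutral diffusion result of Kimura and Ohta (1973), under which the mean age of a derived allele at frequency x is -4 Ne x ln(x) / (1 - x) generations. The sketch below is only an illustration under that assumption. It is not GEVA's model, and the function names and the background-selection scaling factor b are hypothetical.

```python
import math


def expected_neutral_age(freq, ne):
    """Mean age (generations) of a neutral derived allele at frequency freq.

    Classical diffusion expectation for a diploid population of constant
    effective size ne; used here as a stand-in, not as GEVA's estimator.
    """
    if not 0.0 < freq < 1.0:
        raise ValueError("freq must lie strictly between 0 and 1")
    return -4.0 * ne * freq * math.log(freq) / (1.0 - freq)


def age_under_background_selection(freq, ne, b):
    """Same expectation with ne scaled by a hypothetical factor b (0 < b <= 1)."""
    return expected_neutral_age(freq, ne * b)


if __name__ == "__main__":
    # Compare a region with no reduction (b = 1.0) to one with a 20% reduction.
    for b in (1.0, 0.8):
        age = age_under_background_selection(0.1, 10_000, b)
        print(f"b={b}: expected age at 10% frequency = {age:.0f} generations")
```

Under this expectation the age is linear in Ne, so a region whose effective size is reduced by 20% has neutral alleles that are 20% younger at any given frequency. Whether GEVA's estimates scale in the same way is a question for the authors.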
Reviewer #2 (Public Review):
The authors provide an analysis showing that the allele ages of putatively advantageous alleles tend to be older than those of neutral alleles. To do this, the authors first classify mutations as either neutral, advantageous or deleterious based on a metric called the 'evolutionary probability' which is correlated to the impact of selection acting on a mutation. Then, the authors quantify the age of the mutations using the GEVA method and they also quantify tc (the time of the ancestral node of the edge carrying the mutation). Interestingly, the authors find that advantageous mutations tend to have an older allele age and an older value of tc compared to neutral mutations. The authors posit some explanations for this result invoking the action of balancing selection.
This is an interesting paper, and its results could merit an important change in how we think natural selection acts on the human genome. My concerns about the analyses presented in this paper fall under two main points: 1) Showing that the estimates of allele ages and tc are robust on the dataset presented (more on this below). 2) Presenting more simulations or analytical theory showing that the models proposed to explain the results indeed fit the data well. As an example, the authors could perform some simulations (likely using SLiM) under the balancing selection models they posit and then show that these produce data where the allele ages for deleterious, neutral and advantageous alleles have similar patterns to what is observed in the genomic dataset analyzed.
Major concerns
- What is the impact of multiple mutations on the same site on the estimates of allele ages with GEVA?
- GEVA, which is one of the methods used by the authors, 'overestimates "intermediate" times and underestimates older times' according to Ragsdale and Thornton (2023) MBE. What is the impact of this effect on the analysis performed by the authors? Does RUNTC have any known biases in its estimate of tc?
- Additionally, what is the impact of phasing errors on the estimates of allele age presented by the authors?
Reviewer #3 (Public Review):
In their manuscript, Pivirotto et al. make an unexpected observation that a set of candidate beneficial alleles according to the Evolutionary Probability method (EP) have estimated ages thousands of years older than control alleles of similar frequency and outside of functional segments. To explain these unexpectedly older ages, the authors propose a number of interesting evolutionary processes related to balancing selection, including staggered sweeps.
It is important to first mention that the authors do find that as expected, deleterious alleles are younger than controls. This provides evidence that the allele age estimates used by the authors are of sufficient quality to detect age differences between groups of genes. I am also convinced by the fact that EP can be used to focus on a set of alleles substantially enriched in deleterious ones, given the very clear frequency patterns related to EP.
I have a number of concerns about the manuscript, including one rather serious one.
My main concern is that many of the observations made by the authors could be caused by mispolarization of alleles, where either (i) mostly low frequency derived alleles are mischaracterized as ancestral and the other, actually ancestral allele is mischaracterized as a high frequency derived allele, or (ii) mostly low frequency ancestral alleles are mischaracterized as derived. Unfortunately, the authors do not even mention the risk of mispolarization in their manuscript. This is a serious problem for this manuscript because ancestral alleles annotated as derived are by definition going to generate older age estimates than if they were truly derived. It would be very useful to be able to have a look at the full distribution of allele ages rather than just confidence intervals as in Figure 1. I happen to have experience with mispolarization of high frequency ancestral alleles as derived by a maximum likelihood method, different from the one used by the authors (Keightley et al Genetics 2018), where the mispolarization became visible as a very suspicious SFS with a visible excess of high frequency variants, especially those expected to be functional (because of the relatively larger corresponding supply of low frequency deleterious functional variants). Even if the ML method used by the authors is not the same, mispolarization is still a serious risk. Glémin et al. Genome Research 2015 also found that mispolarization is far from being a negligible issue.
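To make the link between mispolarization and an excess of high frequency variants concrete, here is a minimal sketch. It assumes a neutral SFS proportional to 1/i and a single mispolarization rate that does not depend on frequency. Both are simplifying assumptions, and the function names are illustrative rather than taken from the manuscript.

```python
def neutral_sfs(n):
    """Relative expected counts of sites with derived allele count 1..n-1."""
    return [1.0 / i for i in range(1, n)]


def mispolarize(sfs, error_rate):
    """Move a fraction error_rate of sites from derived count i to n - i.

    sfs[k] holds the class with derived count k + 1, so reversing the list
    pairs each class with its mirror class n - i.
    """
    mirrored = sfs[::-1]
    return [(1.0 - error_rate) * s + error_rate * m
            for s, m in zip(sfs, mirrored)]


if __name__ == "__main__":
    true_sfs = neutral_sfs(20)
    observed = mispolarize(true_sfs, 0.02)
    # Inflation of the highest derived-frequency class relative to the truth.
    print(observed[-1] / true_sfs[-1])
```

With a sample of 20 chromosomes, a 2% error rate inflates the highest frequency class by a factor of 1.36 under these assumptions, because the many true singletons leak into the sparsely populated class of near-fixed derived alleles. The leak would be stronger still for functional sites carrying a larger supply of rare deleterious variants.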
Mispolarization of low frequency alleles may be especially prominent in the case of mispolarized deleterious alleles associated with a very negative delta-EP, that then appear as alleles with a very positive delta-EP. Focusing on high delta-EP alleles may then in fact enrich the dataset in mispolarized alleles that then result in older age estimates. Looking at Figure 1B especially, I am worried by the fact that very high delta-EP values seem to go back to the frequencies observed for very negative delta-EP. This is what mispolarization of low frequency alleles might cause as a pattern, in this case especially low frequency ancestral alleles being misidentified as derived?
The authors can address the possible issue of mispolarization in multiple ways. First, they can use simulations of sequences to estimate amounts of mispolarization based on their polarization approach, using substitutions/mutation rates as realistic as possible.
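As a toy version of such a simulation, the sketch below estimates how often a single outgroup base would point to the wrong ancestral allele. It assumes a Jukes-Cantor (JC69) substitution model, one outgroup, and a simple rule that the allele matching the outgroup is called ancestral. This is a stand-in for illustration, not the maximum likelihood approach used by the authors; the branch length parameter and all function names are hypothetical.

```python
import math
import random


def jc69_change_probability(branch_length):
    """Probability that the outgroup base differs from the true ancestral base."""
    return 0.75 * (1.0 - math.exp(-4.0 * branch_length / 3.0))


def expected_mispolarization(branch_length):
    """Analytical error rate among sites where the outgroup matches either allele."""
    p = jc69_change_probability(branch_length)
    return (p / 3.0) / (1.0 - 2.0 * p / 3.0)


def simulate_mispolarization(branch_length, n_sites=100_000, seed=1):
    """Monte Carlo estimate of the same error rate."""
    rng = random.Random(seed)
    p = jc69_change_probability(branch_length)
    informative = 0
    wrong = 0
    for _ in range(n_sites):
        u = rng.random()
        if u < p / 3.0:
            # Outgroup mutated to the derived allele: polarization is wrong.
            informative += 1
            wrong += 1
        elif u >= p:
            # Outgroup still carries the ancestral allele: polarization is right.
            informative += 1
        # Otherwise the outgroup carries a third base and the site is dropped.
    return wrong / informative


if __name__ == "__main__":
    for d in (0.01, 0.05, 0.1):
        print(d, simulate_mispolarization(d), expected_mispolarization(d))
```

A realistic version would need context-dependent rates (for example elevated CpG mutability) and the actual polarization pipeline, since a uniform model like this one is likely to understate the error at hypermutable sites.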
Second, the authors could check if there is suspicious symmetry in the distribution of delta-EP between alleles at frequency f and alleles at frequency 1-f. This pattern could be generated by mispolarization.
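One way such a symmetry check could be set up is sketched below. It assumes the input is a list of (derived allele frequency, delta-EP) pairs and that swapping the polarization of a site maps frequency f to 1 - f and flips the sign of delta-EP, as described above for mispolarized deleterious alleles. The function name and the binning scheme are hypothetical.

```python
from statistics import mean


def mirrored_delta_ep(records, n_bins=10):
    """Pair each frequency bin with its mirror bin and report mean delta-EP.

    records: iterable of (derived_allele_frequency, delta_ep) pairs.
    Returns rows of (bin_start, bin_end, mean_low, mean_mirror), where
    mean_mirror is the mean delta-EP in the bin covering 1 - f.
    """
    bins = [[] for _ in range(n_bins)]
    for freq, delta_ep in records:
        idx = min(int(freq * n_bins), n_bins - 1)
        bins[idx].append(delta_ep)

    rows = []
    for i in range(n_bins // 2):
        low, high = bins[i], bins[n_bins - 1 - i]
        if low and high:  # skip pairs where either bin is empty
            rows.append((i / n_bins, (i + 1) / n_bins, mean(low), mean(high)))
    return rows


if __name__ == "__main__":
    toy = [(0.05, -0.4), (0.05, -0.2), (0.95, 0.3), (0.5, 0.0)]
    for row in mirrored_delta_ep(toy):
        print(row)
```

If the mean in a high frequency bin is close to the negative of the mean in its mirror bin across many bin pairs, that would be the kind of suspicious symmetry worth following up with a comparison of the full distributions.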
My second less serious concern has to do with the use of high delta-EP as evidence that alleles are beneficial. The validation set from the Patel & Kumar 2019 paper is arguably small with 24 known selected variants. It does not follow from the fact that a small set of known selected variants have higher delta-EP, that all variants with high delta-EP tend to be beneficial. This is especially true in the case where beneficial variants tend to be rare, and there are then far more variants expected with high delta-EP than there are beneficial variants. I am willing to change my mind on this if the overall results can be shown to be robust after accounting for allele mispolarization.
Third, I like the idea of staggered sweeps to explain the results, but I am wondering if there is any evidence in the literature of interference between deleterious and advantageous variants that the authors could base their proposed explanation on.
Finally, and I realize that it is a bit of a stretch, I am wondering if the authors could better justify their choices of methods to estimate the age of alleles. What about ARGweaver, Relate or tsdate? How do these methods compare with GEVA? From looking at the literature I could not find a direct comparison of the precision of GEVA compared to these other tools, but it may be worth at least discussing that the results could be further put to the test with other available ARG-based tools to estimate allele ages. Wilder Wohns et al. Science 2022 compare the performance of these different ARG methods with ancient DNA data, and in fact find that GEVA does not perform as well as for example Relate or tsdate.