Selection and the direction of phenotypic evolution

  1. François Mallard  Is a corresponding author
  2. Bruno Afonso
  3. Henrique Teotónio  Is a corresponding author
  1. Institut de Biologie de l’École Normale Supérieure, CNRS UMR 8197, Inserm U1024, PSL Research University, France

Abstract

Predicting adaptive phenotypic evolution depends on invariable selection gradients and on the stability of the genetic covariances between the component traits of the multivariate phenotype. We describe the evolution of six traits of locomotion behavior and body size in the nematode Caenorhabditis elegans for 50 generations of adaptation to a novel environment. We show that the direction of adaptive multivariate phenotypic evolution can be predicted from the ancestral selection differentials, particularly when the traits were measured in the new environment. Interestingly, the evolution of individual traits does not always occur in the direction of selection, nor are trait responses to selection always homogeneous among replicate populations. These observations are explained because the phenotypic dimension with most of the ancestral standing genetic variation only partially aligns with the phenotypic dimension under directional selection. These findings validate selection theory and suggest that the direction of multivariate adaptive phenotypic evolution is predictable for tens of generations.

Editor's evaluation

This is an important paper that takes advantage of a comprehensive evolutionary genetic dataset to tease apart the relationship between genetic variation, selection, and phenotypic divergence over 50 generations. The evidence supporting the conclusions is robust and aligns with a growing body of work that shows patterns of variation can predict divergence over long periods of time and also that evolution does not always occur in the direction of selection, particularly when selection is acting on genetically correlated traits. The questions addressed in this study will particularly appeal to evolutionary biologists and quantitative geneticists.

https://doi.org/10.7554/eLife.80993.sa0

Introduction

Predicting adaptive phenotypic evolution is an important research goal in evolutionary biology, agronomy, or in conservation policy (Arnold, 2014; Nosil et al., 2020; Wortel et al., 2023). It is generally accepted that predicting adaptive phenotypic evolution should be done in the context of the whole organism because organisms are not mere collections of genetically or environmentally independent traits (Gould and Lewontin, 1979; Wagner, 2001). Many traits in natural populations are heritable across generations and under natural selection (Walsh and Lynch, 2018). Often, however, phenotypic evolution is not observed (Merilä et al., 2001; Pujol et al., 2018), or is of opposite direction than predicted, because of environmental or genetic correlations between the traits of interest with unmeasured traits (Morrissey et al., 2012; Hajduk et al., 2020).

Current approaches to predict phenotypic evolution during adaptation to novel environments, at least for infinite sexual populations and under infinitesimal assumptions of trait inheritance (Barton et al., 2017), rely on Simpson’s adaptive landscape idea and its formalization by R. Lande: Δz¯=Gβ (Simpson, 1944; Lande, 1979; Arnold et al., 2001). In Lande’s equation, the adaptive evolution of multiple traits’ means over one generation (the column vector Δz¯) is modeled as a function of the G-matrix, which summarizes the genetic structure and heritable transmission of traits from parents to offspring, with the additive genetic variances of traits being represented in the diagonal entries and the additive genetic covariances between them in the off-diagonal entries. The direction and magnitude of phenotypic evolution depend on the size, due to genetic variances, and shape, due to genetic covariances, of the G-matrix. In particular, trait combinations with more genetic variation (henceforth, called canonical traits), allow for faster and more directed responses to selection (Fisher, 1930; Lande, 1980; Schluter, 1996; Blows and McGuigan, 2015). In Lande’s equation, phenotypic evolution is also modeled as a function of the selection gradients on each trait (the vector β), gradients that describe the strength of directional selection on each trait in populations that can be under stabilizing/disruptive selection on each trait and under correlated selection between traits (Lande and Arnold, 1983). Correlated selection determines the shape of the selection surface: when traits with genetic variation are not aligned with the directional selection gradients, phenotypic evolution will be slower and distorted, resulting in less direct responses (Phillips and Arnold, 1989; Arnold et al., 2001; Svensson et al., 2021).

Lande’s equation predicts adaptive phenotypic evolution but might fail when indirect selection is important. Indirect selection results from unmeasured traits being genetically correlated with the measured traits or when there is correlated selection between measured and unmeasured traits (Lande and Arnold, 1983; Rausher, 1992). Using Robertson’s Secondary Theorem of Natural Selection (STNS) (Robertson, 1968; Walsh and Lynch, 2018), Δz¯=s=σg(z,w), trait changes over one episode of selection are accurately predicted because the selection differentials (s) equal the (additive) genetic covariance (σg) of the trait with relative fitness (w) and thus the trait’s breeding value change in one generation, regardless of unmeasured traits (Morrissey et al., 2010; Morrissey and Bonnet, 2019). However, distinguishing direct from indirect selection is not possible with Robertson’s STNS, which led Stinchcombe et al., 2014 to propose combining its merits with those of Lande’s equation in a single statistical framework to predict adaptive phenotypic evolution, see also Etterson and Shaw, 2001; Morrissey et al., 2012; Hajduk et al., 2020. Using Lande’s equation retrospectively, one can estimate ‘genetic’ selection gradients as βg=G-1s=G-1(z¯a - z¯g), where the selection differentials are defined by the difference between the trait measured in an ancestral population (z¯a), and the trait of a diverging population (z¯g) as predicted by the STNS.

Unfortunately, using Lande’s equation with the genetic selection gradients to predict adaptive phenotypic evolution over several generations depends not only on invariable selection gradients but also on the stability of the G-matrix. The input of genetic covariances by pleiotropic mutations could in the long-term of mutation-selection balance (time being scaled by the effective population size) be aligned with correlated selection and eventually explain phenotypic differentiation among populations and species (Lande, 1980; Jones et al., 2007; Jones et al., 2014; Chebib and Guillaume, 2017; Houle et al., 2017; Farhadifar et al., 2015; Svensson et al., 2021). However, many studies find more standing genetic variation in natural populations than expected at mutation-selection balance (Walsh and Lynch, 2018; Sella and Barton, 2019). In part, this is because in the initial stages of adaptation selection might not be weak relative to recombination as required by theory (Lande, 1980; Nagylaki, 1992; Turelli and Barton, 1994), in part this is because selection is not constant or uniform in temporally changing or spatially heterogeneous environments (Gomulkiewicz and Houle, 2009; Chevin et al., 2010; de Villemereuil et al., 2020; Walter, 2023). In addition, the G-matrix is bound to evolve in the short-term because of selection (Cheverud, 1996; Barton and Turelli, 1987; Turelli, 1988; Shaw et al., 1995), although there is mixed empirical evidence that the G-matrix can evolve to align with the orientations of selection (Steppan et al., 2002; Arnold et al., 2008; Chenoweth et al., 2010; Ramakers et al., 2018; Johansson et al., 2021). The reduction in the size of the G-matrix due to drift can be predicted because it is inversely proportional to the effective population size (Lande, 1976; Lynch and Hill, 1986; Barton et al., 2017), but the shape of the G-matrix will change unpredictably because all populations are finite, and bottlenecks and founder effects are common (Phillips et al., 2001; Phillips and McGuigan, 2006). Hence, besides selection, drift might also impact ongoing phenotypic evolution long before the mutation-selection balance is reached (Whitlock et al., 2002; Mallard et al., 2022).

Here, we ask if Lande’s equation with the genetic selection gradients predicts the direction of adaptive phenotypic evolution for more than one generation. By adaptive phenotypic evolution we mean that multivariate trait responses to indirect or direct selection are correlated with adaptation to a novel and fixed target environment. We focus on the locomotion behavior and body size of the hermaphroditic nematode Caenorhabditis elegans in replicate populations gradually evolving to a high salt (NaCl) concentration in their growth-media for 35 generations and then kept in the constant high salt environment for an extra 15 generations. Our ancestral population was domesticated to constant low salt conditions for 140 generations and contains abundant but neutral standing genetic variation for locomotion behavior (Mallard et al., 2022). Although replicate populations were maintained at high population sizes, our experimental regime exacerbates drift and inbreeding because of a slow rate of environmental change until reaching the target high salt environment (Guzella et al., 2018), and because high salt favors hermaphrodite self-fertilization (Theologidis et al., 2014).

Osmotic pressure from high salt concentration in the growth media shrinks individual body size because of water cell loss (Urso and Lamitina, 2021). For the ancestral population, and in our laboratory conditions, high salt also lowers embryo to adult survival and retards growth until maturity (Theologidis et al., 2014). As hermaphrodites cannot mate with each other, delayed male development results in hermaphrodites reproducing mostly by self-fertilization. During domestication, movement was reduced from that observed among wild isolates (Mallard et al., 2022), and hermaphrodites can further reduce movement in high salt during experimental evolution as males become less frequent and cannot harass them (Barr et al., 2018; Cutter et al., 2019). We further know that the canonical trait of locomotion behavior with the most mutational variance in low salt conditions differs from that with the most standing genetic variation found after domestication (Mallard et al., 2023). Consistent with these observations, several studies have shown that C. elegans mutants insensitive to high salt have specific defects in backward or forward movement, in some of these mutants independently of body size effects (Fujiwara et al., 2002; Swierczek et al., 2011; Zhen and Samuel, 2015). On the other hand, movement can increase during experimental evolution due to more foraging and dwelling, as the bacterial food is not as dense in high salt (Gray et al., 2005). Both foraging, dwelling and mate interactions in C. elegans can be described as a complex collection of distinct behavioral states, which vary in the duration of activity and movement direction (Flavell et al., 2020). All these considerations suggest that multiple traits in locomotion behavior can respond to selection but that it is difficult to a priori define which ones are genetically or environmentally independent. For this reason, we mathematically define individual locomotion behavior in 1-dimensional space by six traits, the six transition rates between stillness, moving forward, and moving backward (Mallard et al., 2022, Mallard et al., 2023). Body size is also analyzed as a seventh trait.

In what follows, we ask whether the ancestral phenotypic plasticity between high and low salt environments is aligned with the ancestral G-matrix. We use selection differentials on the seven traits in low salt and high salt environments to predict phenotypic evolution by describing the phenotypic and genetic divergence in high salt. Using Lande’s retrospective equation, we ask if the genetic selection gradients measured in the ancestral population match the phenotypic selection gradients.

Results

Experimental design and analyses

The ancestral population for experimental evolution (A6140) was ultimately derived from a hybrid population of 16 isolates and domestication to a standard laboratory environment in low salt (25 mM NaCl) growth-media conditions for 140 generations (Teotonio et al., 2012; Noble et al., 2017). GA[1,2,4] replicate populations were derived from A6140, with limited founder effects, and independently exposed for 35 generations to a gradual change in salt concentration (8 mM increase each generation) and then kept in constant high salt (305 mM NaCl) for 15 generations. During the experiment, replicate populations were maintained at high population sizes (N=104), and from generation 35 onwards, hermaphrodite self-fertilization became predominant (Theologidis et al., 2014). Using genomic data, effective population sizes have been estimated at Ne=103 in the domestication low salt environment and under partial selfing (Chelo and Teotónio, 2013).

From the ancestral population (A6140), and the three replicate populations at generation 50 (GA[1,2,4]50), inbred lines were derived by self-fertilization of hermaphrodites (Noble et al., 2017; Chelo et al., 2019). Inbred lines were measured for hermaphrodite locomotion behavior and body size at the usual reproduction time of experimental evolution in low and high salt (186, 61, 61, and 42 lines from the ancestral and evolved populations, respectively, with most lines being phenotyped twice; see Methods for details). Six traits defined locomotion behavior: the transition rates between movement states, stillness, forward, and backward (Mallard et al., 2022). For the inbred lines of the ancestral population, we also use self-fertility data at the usual reproduction time of experimental evolution in high salt, as previously reported by Chelo et al., 2019. Finally, we measure the extent of adaptation to high salt conditions using the outbred experimental populations from which the inbred lines were derived. Assays were designed so that grandmaternal and maternal environmental effects were the same for all the samples being compared.

With this data (Figure 1—source data 1, Figure 1—source data 2), we model phenotypic plasticity (mean population differences between environments) and standing genetic variation for locomotion behavior and body size in the ancestral population, the evolution of locomotion behavior and body size, and G-matrix evolution in the three replicate populations at generation 50. We estimated phenotypic plasticity and phenotypic differentiation in a multivariate MANOVA model and compared it with a univariate response model similarly defined (see Methods). The MANOVA allows us to test for ancestral phenotypic plasticity and phenotypic divergence while accounting for potentially correlated trait variation. The univariate approach allows us to estimate the inbred lines trait values and to test for the phenotypic divergence of each replicate population relative to the ancestral population but does not account for correlated variation in multivariate phenotypic space. Markov chain Monte Carlo methods were used in a Bayesian framework to estimate the G-matrix as half the among-line variance (see Methods) and, for the ancestral population, the G-matrix together with the genetic (co)variances between traits and fitness. Table 1 defines the variables employed.

Table 1
Notation.
VariableDefinition
wrelative fitness in high salt, the self-fertility of hermaphrodites;
from Chelo et al., 2019
qi,jtransition rates between the movement states i and j; see Equation 1
Ggenetic (co)variance matrix of transition rates and body size; see Equation 2
Gqwgenetic (co)variance matrix of transition rates, body size, and self-fertility
Skancestral selection differentials in high salt, with k the salt environment
where traits were measured; last column of Gqw
βgvector of genetic selection gradients; see Equation 6
βvector of phenotypic selection gradients; see Equation 7
SSCPSum-of-Squares and Cross-Product matrices for the environment
and population factors; from MANOVA
dmax1st eigenvector of the population factor SSCP-matrix in high salt
δp1st eigenvector of the environment factor SSCP-matrix, for the ancestral population
gmax1st eigenvector of the ancestral G-matrix, one for each salt environment
emaxfirst eigenvector of the random skewer R-matrix representing the main canonical
trait differentiating the four G-matrices in high salt
Δq¯kMean difference of the GA[1,2,4]50 populations from A6140,
with k the salt environment where traits were measured; from MANOVA
λieigenvalue of the ith eigenvector
Θthe angle between eigenvectors of ancestral genetic variation and
δp,dmax, or emax; see Equation 3
Πproportion of G-matrix variance along δp, dmax, or emax; see Equation 5

Ancestral population

Phenotypic plasticity between salt environments

Before experimental evolution to high salt, we started by characterizing phenotypic and genetic variation in low and high salt environments in the ancestral domesticated population. We find extensive phenotypic plasticity for locomotion behavior traits and body size (Figure 1, Table 2). Because we employed univariate and multivariate approaches to model phenotypic plasticity (see Methods), we compared the estimated environmental effects using both approaches. We find that univariate and multivariate modeling approaches give similar results (Figure 1—figure supplement 1, Figure 1—source data 3, Figure 1—source data 4). Most transition rates are plastic with salt, except from forward to backward movement states. As expected, body size shrinks in high salt.

Figure 1 with 1 supplement see all
Phenotypic plasticity of the ancestral population.

Gray dots indicate the trait values (BLUPs) estimated for each inbred line in the low and high salt environments: F for ‘forward,’ B for ‘backward’, and S for ‘still,’ left to right order indicating movement direction. Gray circles and bars indicate the mean 95% confidence intervals least-square estimates using the univariate approach (see Methods). Significant differences between environments are indicated with a line, when using the multivariate approach (Table 2). Figure source code is linked here - Multivariate analysis of variance (MANOVA) and figures/tables export scripts.

Figure 1—source data 1

Raw data for analysis including all design and environmental covariates.

https://cdn.elifesciences.org/articles/80993/elife-80993-fig1-data1-v2.txt
Figure 1—source data 2

Sample sizes, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-fig1-data2-v2.pdf
Figure 1—source data 3

Multivariate analysis of variance (MANOVA) results, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-fig1-data3-v2.pdf
Figure 1—source data 4

Multivariate analysis of variance (MANOVA) results for the ancestral population by each trait, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-fig1-data4-v2.pdf
Figure 1—source data 5

Eigendecomposition of the MANOVA SSCP matrix for the environment factor, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-fig1-data5-v2.pdf
Table 2
MANOVA results for ancestral phenotypic plasticity and phenotypic differentiation.
FactorDfWilksapprox.Fnum.DFden.DfProb(>F)
Environment10.137523.475832.20E-16
Population30.31040.2211674.62.20E-16
Environment:Population30.8285.4211674.63.99E-14
Residuals589
  1. Notes: The Environment factor refers to the phenotypic difference between high and low salt environments for the ancestral population (Figure 1). The Population factor refers to the phenotypic differences between the four populations in the high salt environment (A6140 and GA[1,2,4]50); (Figure 6). The interaction between Environment and Population refers to the change in phenotypic difference between the four populations between the two environments, that is, to the evolution of phenotypic plasticity (Figure 6—figure supplement 2). The intercept in this MANOVA model is the ancestral population trait values in the high salt environment. Full model results, including the effects of assay design and environmental covariates (block, temperature, density, etc.), can be found in Figure 1—source data 3.

The multivariate approach (MANOVA) allows us to determine the phenotypic dimension of ancestral phenotypic plasticity (δp) that most responds to salt environmental variation (see Methods) as the first eigenvector of the MANOVA SSCP-matrix for the environment factor (Figure 1—source data 5). Transition rates from still to forward or to backward (SF or SB) have opposite loading signs in δp (Table 3). Body size has the same sign of the transition rates from still and from forward to backward (SB and FB) and the opposite sign relative to the other transition rates.

Table 3
Canonical traits of ancestral standing variation, divergence, and selection in high salt.
Traitδp(1)gmax(1)g2(1)g3(1)dmax(1)emax(1)βg(2)mmax(3)m2(3)
SF0.148–0.360–0.3880.241–0.225–0.331–0.93–0.402–0.125
SB–0.103–0.459–0.4090.394–0.365–0.5020.93–0.224–0.209
FS0.1610.2670.3030.1290.2840.3780.350.6070.257
FB–0.0390.532–0.502–0.1170.5170.4590.320.632–0.518
BS0.0940.1420.1530.0350.2530.145–0.960.0770.106
BF0.0620.467–0.4960.1500.4730.498–0.82–0.096–0.765
Size–0.963–0.258–0.264–0.856–0.425–0.1320.830.069–0.102
99%76%9%6%95%--39%37%
  1. Notes: (1) trait loadings of eigenvectors defined in Table 1, for the high salt environment; (2) modes of genetic selection gradients posterior distributions from Figure 9; (3) trait loadings of the first two eigenvectors of the mutational (co)variances matrix in low salt, re-analysis of locomotion behavior data with body size, from mutation accumulation lines reported in Mallard et al., 2023 (see Discussion). The bottom row shows the percent variation each eigenvector explains, when relevant.

Table 3—source data 1

Eigendecomposition of environmental effects in the ancestral population, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-table3-data1-v2.pdf
Table 3—source data 2

Eigendecomposition of the high salt G-matrix, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-table3-data2-v2.pdf
Table 3—source data 3

Eigendecomposition of phenotypic differentiation, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-table3-data3-v2.pdf
Table 3—source data 4

Genetic selection gradients for traits measured in high salt, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-table3-data4-v2.pdf
Table 3—source data 5

Eigendecomposition of the mutation variance-covariance matrix, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-table3-data5-v2.pdf

Standing genetic variation

We next partitioned the phenotypic (co)variances among the inbred lines of the ancestor population into the genetic (G-matrix) and the residual error (co)variances of transition rates and body size (see Methods). Ancestral G-matrices were estimated separately by salt environment, assuming genome-wide homozygosity and no directional non-additive genetic effects. Our estimates are robust to changes in the prior distributions (Figure 2—figure supplement 1).

We find significant genetic variance for all transition rates and body size, in high and low salt environments, except for the transition rate from backward to still in low salt (Figure 2, Figure 2—figure supplement 2). The genetic covariances between transition rates and/or body size show similar values, albeit of lower magnitude in low salt (Figure 2A). The G-matrix size is similar between environments (Figure 2B). In both high and low salt environments, transition rates from still to forward or to backward (SF or SB) are negatively correlated with the other transition rates and positively correlated with each other. Body size shows positive genetic covariances with SF and SB.

Figure 2 with 3 supplements see all
G-matrix of the ancestral population in low salt and high salt environments.

(A). The bottom seven estimates indicate the genetic variances in transition rates and body size, top 15 estimates are the genetic covariances between the seven traits. (B). Total genetic variance in each environment is the trace of the G-matrices (C). Eigenvalues of the six eigenvectors for each G-matrix. For all panels, red (gray) indicates estimates in low (high) salt, with dots, and colored intervals the mode and the 83% or 95% credible intervals of the posterior distribution.

Eigendecomposition of the ancestral G-matrix in high or low salt reveals a similar structure between them (Figure 2C, Figure 2—source data 2). The first canonical trait (gmax, Table 3) encompasses most genetic variation (75% in high salt, 63% in low salt). The next two canonical traits contain less genetic variation (between 6% and 20%) but are larger than null expectations (Figure 2—figure supplement 3). Only the first two canonical traits of the G-matrix have a similar trait loadings between environments (Figure 2—source data 2).

Ancestral plasticity and genetic variation are not aligned

We compared the main canonical trait of phenotypic plasticity with the canonical traits of the high salt G-matrix. Phenotypic plasticity is not aligned with the G-matrix (Figure 3). This is because the amount of genetic variance along the dimension of phenotypic plasticity (δp) is not different than that expected by chance (Figure 3A), and also because the angle between δp and gmax is, if anything, larger than expected by chance (Figure 3B). δp appears to similarly summarize environmental variation as the third canonical trait from the high salt G-matrix (Table 3), a trait that encompasses only 6% of standing genetic variation. As noted before, these differences stem from the association between still-to-forward and still-to-backward transition rates (SF and SB), which are genetically positive and environmentally negative, i.e., they have opposite signs in gmax and δp, respectively. We suspect that positive associations between SF and SB reveal more dwelling, while negative association more individual foraging (Flavell et al., 2020).

Aligment between phenotypic plasticity and standing genetic variation in high salt for the ancestral population.

(A) Projection of the high salt G-matrix along the phenotypic plasticity canonical trait δp. Dots show the mean estimate with bars the 83% and 95% credible interval of the posterior G-matrix distribution. Orange bar shows the null 95% CI of the posterior distribution of modes of 1000 G-matrix randomized by inbred line and block identities (see Methods). (B). The angle (Θ, Equation 3) between δp and the first three eigenvectors of the ancestral G-matrix (gmax, g2, and g3). Θ does not differ from the random expectations. Dots show the mean estimate with bars the 83% and 95% credible interval of the posterior G-matrix distribution. The null expectation was obtained by computing the angle between pairs of random vectors sampled from a uniform distribution (see Methods).

Selection differentials are similar across environments

Selection differentials on transition rates and body size measured in the two salt environments are their genetic covariances with the self-fertility measured in the high salt environment (sk, Table 1). We used estimates of the ancestral Gqw-matrices to obtain these selection differentials (see Methods). G-matrices and Gqw-matrices estimates of genetic (co)variances in locomotion behavior and body size are similar (Figure 4—figure supplement 1). We find genetic variance for self-fertility in high salt (Figure 4, Figure 4—source data 1). We also find that the transition rates from still to forward or from still to backward (SF or SB) measured in high salt have positive genetic covariances with self-fertility, and all other transition rates have negative covariances (Figure 4). Small or no selection differentials exist in low salt transition rates, and only body size shows a clear positive selection differential in both environments. These results are robust to variation in self-fertility (Figure 4—figure supplement 2).

Figure 4 with 2 supplements see all
Selection differentials in the ancestral population.

Ancestral genetic covariances between transition rates and body size measured in high salt (gray) or low salt (red) with high salt self-fertility. Dots and colored intervals show the mode and the 83% or 95% credible intervals of the posterior Gqw distribution.

Evolutionary divergence

Adaptation to the high salt environment

Having characterized ancestral standing variation, we describe the divergence of the three replicate populations after 50 generations of evolution. First, we measured the degree of adaptation to high salt by comparing the mean fitness of the GA[1,2,4]50 populations to the ancestral A6140 population in competition experiments against a tester GFP-strain (see Methods; Figure 5—source data 1). We find an increase in the mean fitness of all three replicate populations (Figure 5, Figure 5—source data 2). By generation 50, populations adapted to the high salt environment.

Adaptation to the high salt environment.

Colored dots show the ratio of wild-type to green-fluorescent protein (GFP) alleles after one generation of pairwise competitions between the outbred experimental populations with a GFP-tester strain. Filled circles indicate the least-square mean estimates with 95% confidence intervals; asterisks indicate significant differences between each replicate population relative to the ancestral population.

Locomotion traits diverged in low and high salt environments

Concomitant with adaptation, there was phenotypic divergence for the locomotion traits and body size measured in high salt (Figure 6, Table 2). Estimates of phenotypic divergence are robust to multivariate and univariate modeling (Figure 6—figure supplement 1, Figure 6—source data 1). From the univariate models, we find that for each transition rate, at least one replicate GA population differed from the ancestor and that the three replicates showed significant divergence for three transition rates (Figure 6, Figure 6—source data 2). For body size we find that only one replicate populations diverged from the ancestral population. The amount of genetic variance did not limit phenotypic divergence, as the back-to-still and forward-to-still transition rates (BS and FS) diverged while showing the lowest genetic variances in the ancestral population (Figure 2). Eigendecomposition of the MANOVA SSCP-matrix for the population factor further reveals that a single canonical trait explains most phenotypic differentiation between the four populations (dmax; Figure 6—source data 3, Table 3).

Figure 6 with 2 supplements see all
Phenotypic divergence in the high salt environment.

Each panel shows the transition rates and body size as in Figure 1. Dots indicate the values estimated for each inbred line in a high salt environment, gray for the ancestral population, blues for the evolved replicate populations. Circles and bars indicate the mean and the 95% confidence intervals least-square estimates. Line shows significant differentiation between all four populations using the multivariate MANOVA approach (Table 2). Significant differences between each of the evolved populations and the ancestral population using the univariate approach are shown with asterisks (Figure 6—source data 2).

In the low salt environment, there was less phenotypic divergence than in the high salt environment, with only three out of the six transition rates having at least two replicate populations significantly different from the ancestor (Figure 6—figure supplement 2). Unlike in high salt, body size in low salt showed a marked increase after experimental evolution in all replicate populations.

Genetic variance decreased during evolution

We next characterized genetic divergence by estimating the high salt G-matrices of the GA[1,2,4]50 populations and comparing them with the ancestral high salt G-matrix. We did not model the evolution of the G-matrix in the low salt environment. This analysis shows that the size of the high salt G-matrix was reduced during experimental evolution, independently of the replicate population (Figure 7, Figure 7—source data 1). However, we continue to find that most genetic variances for the GA populations differ from null expectations (Figure 7—figure supplement 1). Eigendecomposition of the GA G-matrices indicates that 3–5 canonical traits differ from null expectations (Figure 7—figure supplement 2). Furthermore, evolved populations continued to have significant genetic variances in the three canonical traits of ancestral standing genetic variation (Figure 7—figure supplement 3).

Figure 7 with 3 supplements see all
Genetic divergence in the high salt environment.

(A) High salt G-matrix evolution of ancestral (gray) and evolved GA populations (blues). Eigendecomposition of the ancestral G-matrix (gray) can be found in Figure 2, those of the evolved GA populations in Figure 7—figure supplement 2. (B) Total G-matrix variance for each experimental population. (C) Genetic variance along emax, the main canonical trait of genetic differentiation obtained after the random skewers analysis (see Methods, Table 1). Dots and colored bars show the mode and the 83% or 95% credible intervals of the posterior distribution. Figure 7 sources linked here - matrix computation, random skewers analysis, and Figure 7 scripts. The Figure 7 scripts also produces all three figure supplements.

Figure 7—source data 1

G -matrices of evolved populations in the high salt environment, see table.

https://cdn.elifesciences.org/articles/80993/elife-80993-fig7-data1-v2.pdf

Random skewers analysis shows that when projecting the four high salt G-matrices along 1000 random phenotypic directions, 250 of them showed a significant difference between the ancestral and at least one of the evolved populations (see Methods). Using these 250 vectors we build an R-matrix of genetic divergence between the four populations. Eigendecomposition of the R-matrix then revealed that a single canonical trait explains most divergence (emax, Table 1). In this canonical trait at least 2 of the 3 evolved populations showed a reduction in variance (Figure 7C). An alternative eigentensor analysis to detect genetic divergence among the 4 populations confirms the random skewers analysis (see results in our GitHub appendix).

Divergence along ‘genetic lines of least resistance’

We asked whether phenotypic divergence (dmax) and genetic divergence (emax) occurred along the dimensions of most ancestral genetic variation (gmax). For this analysis, we calculated the angle (Θ, see Equation 3 in Methods) and the proportion of overlap (Π, see Equation 5) between these canonical traits. Table 3 summarizes the main canonical traits of ancestral standing variation, and of phenotypic and genetic divergence.

Most of the genetic variance of the ancestral high salt G-matrix along dmax is higher than expected by chance (Figure 8A). This is because dmax and gmax are aligned and their angle is very small when compared with other canonical traits of ancestral standing genetic variation, or with a null expectations (Figure 8B). We do not find a significant proportion of genetic variance of the high salt G-matrix along emax (Figure 8C), but find a small angle between emax and gmax (Figure 8D), which is indicative of a good alignment between these canonical traits.

Phenotypic and genetic divergence alignments with ancestral standing variation.

(A) Projection of the total ancestral genetic variance along the phenotypic divergence canonical trait dmax. Dots show the mean estimate with bars the 95% CI. Orange bar shows the null 95% CI after randomizing the G-matrix (see Methods). Mean of the observed posterior distribution (0.93) is outside the 95% CI of the randomized posterior modes (0.80–0.91). (B). The angle (Θ) between dmax and the first three eigenvectors of the ancestral G-matrix (gmax,2,3). The null expectation was obtained by computing the angle between 1000 pairs of random vectors. (C and D) Similar projection and angles as shown in (A) and (B) but with emax - the vector of the main genetic divergence - instead of dmax. In (C), the null and observed projections do not differ. Because emax and gmax are almost aligned, both the observed and the null are very close to one (as Π is estimated relatively to λmax, see Equation 5) and the relative phenotypic variance between traits is conserved in the randomized G-matrices.

Indirect selection and predicting phenotypic evolution

Expected and observed responses to selection are aligned in high salt

Using Lande’s retrospective equation, we compared the genetic selection gradients obtained with selection differentials on traits measured in high salt (βg; see Methods, Equation 6) to the phenotypic selection gradients obtained with the observed responses in high salt after 50 generations (β; see Methods, equation 7). The ancestral population’s high salt G-matrix was assumed stable during experimental evolution. Credible intervals were obtained, however, by sampling the G-matrix from its posterior distribution, with fixed ancestral selection differentials (sk; Figure 4), or fixed observed phenotypic divergence for each of the three replicate populations (Δq¯k; Figure 6 for high salt, Figure 6—figure supplement 2 for low salt).

We find that phenotypic selection gradients for all traits are highly heterogeneous because higher phenotypic divergence has an outsize effect on the mean and error estimates. Nonetheless, the phenotypic selection gradients overlap with the corresponding genetic selection gradients for at least one replicate population (Figure 9A; Figure 9—source data 1, Figure 9—source data 2). Evidence for a lack of overlap between selection gradients in two replicate populations is found for the transition rate backward-to-forward and for body size. Only for the transition rate SB is there an overlap of selection gradients for all three replicate populations.

Figure 9 with 2 supplements see all
Predicting phenotypic evolution with Lande’s equation.

(A) Indirect and direct selection. Genetic selection gradients βg (gray) and phenotypic selection gradients β (blues) for each replicate population, see Equation 6 and Equation 7, respectively. β were divided by 3.5 for scaling (the average ratio observed/predicted divergence, panel C) rather than by 140 (the total number of generations in the experiment) for visual convenience. (B) The direction of phenotypic evolution. Angle between the expected phenotypic divergence (selection differentials, sk; Figure 4) and the observed phenotypic divergence at each replicate (Δq¯k; Figure 6). Circles show the results in the high salt environment and stars in the low salt environment. The expected angle by chance is in gray and was generated by computing 1000 angles between pairs of randomly generated vectors from a uniform distribution U7(1,1). (C) The magnitude of phenotypic evolution. The ratio phenotypic divergence at each replicate (Δq¯k) with expected divergence (sk). For all panels, dots/circles/stars and colored bars show the mode and the 83% or 95% credible intervals of the posterior distributions obtained by sampling in posterior distribution of the ancestral high salt G-matrix (Figure 2).

Given replicate heterogeneity, testing whether selection theory predicts the direction of phenotypic evolution is possibly best estimated as the angle between expected and observed responses to selection. For traits measured in high salt, we find that the angle between selection differentials and observed phenotypic divergence is low for all replicates (Figure 9B). In contrast, when traits are measured in low salt, expected and observed responses are not aligned. Similarly, whether theory predicts the magnitude of phenotypic evolution can be estimated as the ratio between expected and observed responses. For all traits measured in high salt, and across replicate populations, observed phenotypic divergence is on average, across traits and replicates, 3.5 times the selection differentials (Figure 9C). For the traits measured in low salt, predictions about the magnitude of divergence further degenerate (not shown).

Indirect versus direct selection

There is evidence of direct selection on SB and backward-to-forward (BF), as well as on body size (Figure 9A, Figure 9—source data 1). We find a positive genetic selection gradient for SB and a negative genetic selection gradient for BF. There is also a positive selection gradient for body size, opposite in sign to its association with other traits in the canonical traits of standing genetic variation, and in the canonical traits of phenotypic and genetic divergence (Table 3). Traits with low genetic variance, FS and BS (Figure 2—figure supplement 2), do not appear to bias the genetic selection gradient estimates (Figure 9—figure supplement 1). However, none of the genetic selection gradient estimates differ from zero when sampling both the G-matrix and the selection differentials from their respective posterior distributions to obtain credible intervals (Figure 9—figure supplement 2).

Discussion

Modeling approaches for predicting adaptive phenotypic evolution before mutation-selection balance is reached are based on Lande’s equation (see Introduction). In Lande’s equation, the trait change over one generation equals the ancestral G-matrix times the directional phenotypic selection gradients, but might not be accurate in the presence of indirect selection. This can be remediated by replacing the phenotypic selection gradients with the genetic selection gradients obtained after measuring selection differentials in the ancestral population. For predicting phenotypic evolution over several generations, however, one must assume invariable selection gradients and that the G-matrix is stable despite selection and drift. We sought to test the selection theory by finding whether we could predict phenotypic evolution for 50 generations of experimental evolution.

We followed seven traits with different environmental and genetic dependencies, the six transition rates between movement states and body size (Table 3). Our ancestral population was adapted to the low salt conditions (Chelo and Teotónio, 2013; Theologidis et al., 2014), before challenging three replicate populations to a gradual increase in the salt concentration in the growth media for 35 generations and 15 extra generations in high salt. Fifty generations of experimental evolution led to adaptation (Figure 5) and to phenotypic and genetic divergence (Figure 6, Figure 7). Body size measured in high salt did not consistently evolve among replicate populations but individual movement increased. This is because the transition rates from the still state have increased while those to the still state have decreased. Adaptive phenotypic divergence followed the direction of the canonical trait with more ancestral standing genetic variation (Figure 8, Table 3), and therefore, we could predict phenotypic evolution, though only its direction and when the component traits of the multivariate phenotype were measured in the high salt environment (Figure 9). When considering the component traits of the multivariate phenotype individually, and due to replicate population heterogeneity, we could confidently predict the evolution of only one of the seven traits followed (SB). We could not predict the magnitude of evolution for any individual trait. These findings are relatively unique because we described G-matrix evolution and measured the ancestral selection differentials to predict phenotypic evolution, but they add to the results of a growing number of experimental studies testing Lande’s equation across tens of generations. For example, a recent re-analysis of up to 60 generations in constant and homogeneous environments, for five wing traits in Drosophila melanogaster, showed an alignment between the main canonical trait of genetic variation in the evolved populations with adaptive phenotypic divergence (Yeaman et al., 2010; Walter, 2023).

As the canonical trait explaining most variation in the ancestral population, gmax, also the second and third canonical traits differ from null expectations (g2,3, Figure 2, Figure 2—figure supplement 3), and remain so after evolution (Figure 7—figure supplement 3), despite potential variance inflation problems due to the MCMC methods we employed (Morrissey et al., 2012; Sztepanacz and Blows, 2017b). For these ancestral canonical traits, selection must have been responsible for the observed loss of genetic variance, particularly after generation 35 (Figure 7). Assuming an infinitesimal model of trait inheritance, drift is expected to lead to a loss of genetic variance by (1-1/2Ne) at each generation (Barton et al., 2017). Even in the unrealistic situation of complete selfing during the experiment, and considering effective population sizes on the order of 1000 (Chelo and Teotónio, 2013), less than 5% was expected to be lost by genetic drift by generation 50, values that were not observed (only about half of the genetic variance was lost, Figure 7B). Supporting loss of variation by selection we before showed that allelic diversity at neutral single-nucleotide polymorphisms is reduced relative to ancestral levels by 5% by generation 22 (Theologidis et al., 2014), and to 20% only by generation 50 (Chelo et al., 2019). We did not, however, test whether the direction of phenotypic divergence occurred along the ancestral g2,3 traits because they together explain 15% of the variation and there is a poor statistical power to do so. Furthermore, our previous work suggests that variation in the ancestral g2,3 might have been lost by drift during the first 35 generations of the experiment such that they were perhaps of little consequence later on when populations reached the high salt environment (Guzella et al., 2018). Few studies have demonstrated that canonical traits of little genetic variance can influence selection responses (Kirkpatrick, 2009; Blows and McGuigan, 2015; Sztepanacz and Blows, 2017a). In one of the few examples, Hine et al., 2014 showed that low-variance canonical traits of eight cuticular hydrocarbons in Drosophila serrata respond to artificial selection during six generations, though inconsistently among replicate populations. In our experiment, we suspect that showing that canonical traits with a small amount of genetic variation impact adaptive phenotypic divergence will require finding the quantitative trait loci (QTL) responsible for their expression (Svensson et al., 2021; Kelly, 2009). If these low-variance canonical traits influence adaptation then allele frequency dynamics at the relevant QTL because of selection might be detected when comparing genomic data between ancestral and derived populations (Long et al., 2015; Barghi et al., 2020).

Our findings also highlight the relationship between phenotypic plasticity and adaptation to novel environments (Ghalambor et al., 2007; Pfennig et al., 2010; Teotónio et al., 2009; Draghi and Whitlock, 2012; Noble et al., 2019). While the discussion has been on showing that population persistence is more likely if plasticity is aligned with the direction of selection (Price et al., 2003; Lande, 2009; Chevin et al., 2010), our results show that plasticity only reveals the topography of the adaptive landscape. In high salt conditions, populations move away from the ancestral phenotypic optimum (Figure 1), with an associated fitness cost (Theologidis et al., 2014). Adaptation to the high salt target environment after generation 35 presumably involved recovering a phenotype similar to that of the ancestral population that alleviated this fitness cost (Table 3). In particular, high salt in the ancestral population reduces body size and SB transition rates while increasing SF transition rates. Symmetrically, selection favors increased body size, increased SB, and decreased SF. However, because all three traits show positive genetic covariances with each other (Figure 2), even if plasticity is oriented with selection (but of the opposite sign), phenotypic evolution is constrained by a lack of genetic variation in the appropriate canonical trait. The ancestral population had genetic variation in the direction of selection (the canonical trait g3, Table 3), but as argued above it was probably lost during gradual salt evolution because of drift such that when reaching high salt populations could not have further responded to selection (Matuszewski et al., 2015; Guzella et al., 2018). Future evolution in the direction of the ancestral multivariate phenotypic optimum, or close to it, should then be conditional on the appearance of de novo pleiotropic mutations. Assuming that mutational covariances do not vary with the environment, it is unclear that there can be much further phenotypic evolution as elsewhere we characterized mutational covariances and did not find any in the direction of selection (see Table 3 and Mallard et al., 2023).

Ancestral phenotypic plasticity can thus be considered ‘non-adaptive’ (Ghalambor et al., 2007). Hence, it is unsurprising that we could not predict low salt phenotypic evolution. This is not explained because of a general lack of genetic variance in locomotion traits and body size in the ancestral population (Figure 2) but because their selection differentials were small or did not differ from zero (Figure 4). An exception is body size. Body size is reduced by high salt conditions (Figure 1), and there is probably direct selection for increased body size in high salt (Figure 9). However, there was little body size evolution in the high salt environment (Figure 6). In contrast, there was evolution of increased body size when it was measured in the low salt environment, as expected from the positive selection differentials in the ancestor population (Figure 4). Body size measured in low salt could only have evolved because of indirect selection by being genetically correlated with high salt body size or some other traits expressed in high salt. Hence, the evolution of plasticity in body size might be predictable although the evolution of plasticity in the multivariate phenotype is not. The fact that body size measured in low salt evolved further reveals that the evolved populations were in a region of the adaptive landscape that was not readily accessible to the ancestral population even if it had been domesticated to low salt conditions for 140 generations. In other words, body size evolution in low salt supports the long-held hypothesis that phenotypic plasticity facilitated the evolution of novelty (Wagner and Lynch, 2010; Moczek et al., 2011; Levis et al., 2018).

Robertson’s covariance [σg(z,w)] was originally found to describe an episode of selection (the selection differentials, s) and later to predict adaptive phenotypic evolution over one generation (Δz¯) as the secondary theorem of natural selection (Walsh and Lynch, 2018; Hajduk et al., 2020). Applications so far have been mostly limited to explaining the evolution of individual traits and when time-series of trait and fitness are concurrently obtained with pedigrees such that estimates of trait breeding values and genetic selection gradients can be updated every few generations (Stinchcombe et al., 2014; Morrissey et al., 2012; Hajduk et al., 2020; Hadfield, 2010). One problem that has received particular attention has been to ask whether indirect selection or phenotypic plasticity and robustness explain phenotypic stasis, despite trait heritability and a significant phenotypic selection gradient on the observed trait (Merilä et al., 2001; Kruuk et al., 2002). For example, Biquet et al., 2022 followed a blue tit population for more than 40 years, in which egg laying has changed to earlier spring dates, presumably because of climate change. Despite significant heritability of egg laying and directional selection for earlier dates, modeling the breeding values did not reveal any temporal trend, consistent with a lack of genetic covariance between egg laying date and fitness. There was thus no genetic divergence for laying date, which Biquet et al., 2022 could attribute to phenotypic robustness and the stochastic nature of individual development to maturity. Conversely, using a similar approach, Bonnet et al., 2017 found that evolution towards earlier parturition dates in a red deer population could be predicted and was consistent with the estimated change in breeding values. In a study following two traits, selection due to human harvesting of a prey species of wild salmon (for feeding domestic salmon) explained divergence in early maturity and small body sizes, despite directional selection for increased body size at maturity because of fishing (Czorlich et al., 2022). Our results suggest that when pedigrees are difficult to obtain, predicting the direction of adaptive multivariate phenotypic evolution for tens of generations may be possible without updating estimates of selection gradients and the G-matrix every few generations. This is because although the environment changed for 35 generations in our experiment directional selection was maintained and only the size of the G-matrix was reduced.

In sum, we have shown that using Lande’s equation with genetic selection gradients is valid to predict the direction of phenotypic evolution in a new environment, after a gradual environmental change for 35 generations and 15 generations in the new environment. However, selection theory not necessarily predict the direction or magnitude of evolutionary change in all the component traits of the multivariate phenotype, especially if the traits are not measured in the new environment. This is because there are variable and complex genetic and environmental dependencies between individual traits. There are few experimental tests of selection theory such as ours. Therefore, more will be needed to generalize our results to natural populations, particularly those challenged by changing and heterogeneous environments.

Materials and methods

Experimental populations and environmental conditions

Request a detailed protocol

The ancestral population is named A6140, where ‘A’ stands for androdioecious, ‘6’ for replicate six, and ‘140’ for the number of generations of domestication to a standard laboratory environment (Teotónio et al., 2017). A6140 resulted from the hybridization of 16 founder wild strains during 33 generations followed by 140 generations characterized by 4 day discrete and non-overlapping life-cycles at N=104census sizes and Ne=103effective population sizes (Teotonio et al., 2012; Chelo and Teotónio, 2013; Theologidis et al., 2014). Our standard laboratory environment involves populations being maintained in 10 × 9 cm Petri dishes NGM-lite agar media containing 25 mM NaCl and a homogenous lawn of E. coli HT115 that served as food from the L1 larval stage until reproduction. Each Petri dish contains 1000 individuals which are mixed during reproduction, with embryos being collected and synchronized at the L1 larval stage to start a new generation.

We report the evolution of locomotion behavior and body size in three independent replicate populations (named GA[1,2,4] populations: ‘G’ for gradual, ‘A’ for androdioecious, ‘#’ for replicate number). They were derived from splitting into three a single pool of at least 104 individuals sampled from the A6140 population. GA populations were maintained in the same conditions as during domestication except that the NGM-lite media was supplemented with 8 mM of NaCl at each generation for 35 generations and then kept at constant 305 mM NaCl for an additional 15 generations. Details about the derivation of the GA populations can be found in Theologidis et al., 2014. We refer to the NGM-lite 305 mM NaCl environment as the ‘high salt’ target environment, while the domestication 25 mM NaCl environment as the ‘low salt’ environment.

C. elegans is an androdioecious nematode, where hermaphrodites can self but outcross only when mated with males. Natural populations are depauperate of genetic diversity and males are rare due to a long history of selfing, selective sweeps, and background selection (Andersen et al., 2012; Rockman et al., 2010). Under the domestication environment, however, outcrossing is readily maintained at frequencies between 60% and 100% (Teotonio et al., 2012, Mallard et al., 2022). In GA populations outcrossing is maintained at close to 100% for 35 generations, reduced to about 30% in GA1, 14% in GA2, and 5% in GA4, by generation 50 of the experiment (Theologidis et al., 2014).

As reported before, we derived inbred lines by selfing single hermaphrodites from the ancestor (A6140) and the three replicate populations at generation 50 (GA[1,2,4]50) for a minimum of 10 generations (Noble et al., 2017; Noble et al., 2021). Male frequency in the inbred lines is low, on the order of the mutation rate for the non-disjunction of the X-chromosome (Teotónio et al., 2006) – sex-determination is chromosomal with hermaphrodites XX and males X –.

Populations and inbred lines were cryogenically stored (Stiernagle, 1999), allowing for contemporaneous measurements of ancestral and evolved outbred populations and their inbred lines. Grandmaternal and maternal environmental effects are common to the samples being measured (Teotónio et al., 2017).

Adaptation to high salt

Request a detailed protocol

We measured the increase in mean relative fitness among the ancestral population (A6140) and evolved populations at generation 50 (GA[1,2,4]50) using pairwise competition experiments between them and a tester line (Teotónio et al., 2017). As a tester, we employed an inbred line (EEV1402) derived by selfing from the A6140 population, and that expressed a green-fluorescent-protein (GFP) morphological marker (Chelo et al., 2013). For the assays, we revived the four populations and the tester line (>1000 individuals each) and let individuals reproduce and starve for 10 days. Starved individuals were then seeded on fresh plates with food at a density of 1000 L1 larvae in low salt. We grew them for two complete generations in high salt, except the GFP tester which was only grown in high salt for one generation. At the third generation, we seeded 500 L1 larvae of the GFP tester line together with 500 L1 larvae of 1 of 4 experimental populations in high salt. For A6140, we seeded 15 plates (technical replicates), for GA150 4 plates, for GA250 four plates, and for GA450 five plates. In each of these plates, 72 hr after L1 seeding, individuals were subject to the ‘bleach/hatch-off’ protocol, the standard of our life-cycle, to recover live embryos and, 24 hr later, synchronized L1 larvae. We scored an average of 169 larvae for GFP expression in each technical replicate.

The relative proportion of non-GFP to GFP measures the relative fitness of the experimental populations to the tester after one generation of competition (Teotónio et al., 2017). To analyze this data, we used a generalized linear model in R (R Development Core Team, 2018), testing for the evolution of the ratio non-GFP/GFP, assuming a binomial error distribution (‘quasibinomial’ family option) and allowing for overdispersion of the data. Post-hoc pairwise comparisons were performed between the ancestral and the evolved populations with Tukey tests using the glht function in the multcomp package in R (Hothorn et al., 2008).

Locomotion behavior

Request a detailed protocol

Inbred lines were thawed from frozen stocks on 9 cm Petri plates and grown until exhaustion of food. This occurred 2–3 generations after thawing, after which individuals were washed, adults removed by centrifugation, and three plates per line seeded with 1000 larvae at mixed larval stages. Samples were then maintained in the standard domestication environment for two complete generations. At the assay generation (generation 4–6 generations post-thaw), starvation-synchronized L1 larvae were seeded in low and high salt. Adults were phenotyped for locomotion behavior 72 hr later at their usual reproduction time in one 9 cm plate (technical replicate). At the beginning of each assay we measured ambient temperature (T) and humidity (H) in the imaging room.

Given the number of lines to phenotype, we repeated the above protocol several times over several years, with each repetition defining a statistical ‘block’ on a given day. In total, we phenotyped 186 lines from the A6140 population and 61, 61, and 42 lines from each of the GA[1,2,4] populations, respectively, with most lines being phenotyped twice and always in separate blocks (average of 1.9 in low salt, and of 2 in high salt).

We imaged adult hermaphrodites using the Multi-Worm Tracker [version 1.3.0; Swierczek et al., 2011 and used the materials and protocols of Mallard et al., 2022]. Each movie contains about 1000 tracks of hermaphrodites (called objects) with a mean duration of about 1 min. Standardized to a common frame rate (4 Hz), we filtered and extracted the number and persistence of tracked objects per movie and assigned movement states across consecutive frames as forward, still or backward (assuming forward as the dominant direction of movement). Mean object density (D) per movie was also retrieved to be used as a covariate in modeling.

Locomotion behavior in 1-dimensional space is described by the transition rates between still (S), forward (F) and backward (B), plus the self-transition rates. Modeling is detailed in Mallard et al., 2022. Transition rates between movement states are assumed to follow a continuous time Markov process. The Markov process is a stochastic process modeling changes in movement state as a matrix Q. In our data, the Markovian memoryless assumption is only marginally violated (Mallard et al., 2022). The elements in Q, noted qi,j, are the transition rates from state i to state j (off-diagonal elements for ij, and with qi,j>0). This definition constrains self-transition rates (diagonal elements) to be of the opposite sign to the sum of the two transition rates leaving the relevant movement state:

(1) qi,i=-jiqi,j

This ensures that the probability of leaving a given movement state towards any other state during a waiting time Δt is one minus the probability of remaining in the same state (see Mallard et al., 2022 for a more detailed explanation). Therefore, only the six transition rates between movement states are mathematically independent and we thus ignore self-transition rates.

For estimation, we used log-likelihood models as defined in Mallard et al., 2022 and specified them with the msm package (Jackson, 2011) in RStan [Stan Development Team, 2018, R version 3.3.2, RStan version 2.15.1]. Because qi,j>0, all analyses were performed on the natural log scale to ensure normality. We used multi-log normal prior distributions with the mean transition rate and a coefficient of variation ln(qi,j)N(ln(2),0.6). We retained the means of the posterior distributions as the per-plate transition rates for all the subsequent analyses.

Body size

Request a detailed protocol

We included the measurements of body size obtained from the Multi-Worm Tracker movies as a seventh trait. Movie frames were sampled only for forward tracked-objects to minimize posture variation. We then extracted the per-track object mean area (Swierczek et al., 2011). These values were summarized as the per-plate median of all track mean values. We then re-scaled these measurements so that the averaged phenotypic variance in each environment is roughly similar to the average transition rates phenotypic variance. This was done by multiplying the body size by a factor of 50. We chose this procedure rather than dividing all the phenotypic values by their mean, cf. Houle et al., 2011, because our transition rates’ means are close to zero while spanning both negative and positive values. Dividing these transition rates by their means would lead to an artificial increase in phenotypic variance.

Self-fertility

Request a detailed protocol

To estimate selection differentials, we used previously-published data on hermaphrodite self-fertility of the A6140 inbred lines in high salt (Chelo et al., 2019). Self-fertility was measured under environmental conditions that closely followed those of experimental evolution. An average of 42 hermaphrodites were scored for self-fertility per inbred line (minimum 22 and maximum 85 individuals). Self-fertility includes the fecundity of hermaphrodites at the usual time of reproduction by selfing and the viability of their progeny until the L1 larval stage. The log-transformed, covariate-adjusted self-fertility values (best linear unbiased prediction estimates, BLUPs) for each inbred line were downloaded from Chelo et al., 2019, exponentiated, and divided by the mean to obtain a proxy for relative fitness (noted w; Table 1).

Phenotypic plasticity and phenotypic divergence

Request a detailed protocol

We used a multivariate analysis of variance (MANOVA) to model ancestral phenotypic plasticity and the divergence of locomotion behavior and body size. The six transition rates and body sizes were fitted as a multivariate response variable, with fixed effects of temperature and humidity at the time of movie recording and the log of object density in each Petri plate. These three environmental variables were centered and standardized before the analysis to mean=0 and sd=1. We further modeled a fixed effect of block and a fixed effect of year accounting for when the different lines were measured. The main factors of interest were the fixed effects of the salt environment and the fixed effects of evolution, together with their interaction; the last factor with four population ID levels (A6140, GA[1,2,4]50). The residual error was assumed to follow a multivariate normal distribution. We used the manova function in the stats package in R for computation (R Development Core Team, 2018), with Wilks tests are being used for the significance.

From the MANOVA results, we extracted the Sums-of-Squares and Cross-Products (SSCP) matrices for the fixed effects of environment and population and eigendecomposed these matrices to describe the orthogonal canonical traits maximizing phenotypic variation in each (Walter, 2023). For the SSCP-matrix of the environment, the first eigenvector is the dimension containing the most phenotypic plasticity in the ancestral population and is here named δp. For the SSCP-matrix of evolution, the first eigenvector is the dimension of divergence among the four populations in high salt and is here called dmax are the eigenvalues measuring the variation explained by each eigenvector. Estimated mean-least square divergence per replicate population is here called Δq¯k, with k being the environment.

Additionally, we modeled the traits individually using linear mixed-effects models to estimate the best linear unbiased predictions (BLUPs) of transition rates and body size per inbred line (used only for visualization purposes in the figures). This univariate approach allowed testing the divergence of transition rates for each replicate GA population from the ancestral A6140 population. This univariate model was similarly formulated as the MANOVA, except the block was included as having random effects. For model fitting, we employed the lme4 package in (Bates et al., 2015) in R. Post-hoc pairwise contrasts employed Tukey tests with the emmeans package (Lenth, 2021).

G-matrices and genetic divergence

Request a detailed protocol

Using the same model, we estimated the G-matrices of the ancestral population A6140 and the three evolved replicate populations GA[1,2,4] separately for the traits measured in the low and high salt environments.

The six transition rates and body size were fitted as a multivariate response variable column-vector y in the model:

(2) y=μ+n=17α×[T,H,D]+γ+ζ+η+ϵ

where μ are the intercepts and α are the environmental fixed effects of temperature (T), humidity (H), and log density (D). We denote [T,H,D] to simplify notation of the product (×) among the environmental variables (fitting all three variables as fixed effects, the three two ways interactions, and the three-way interaction for a total of seven fixed effects). γ was defined as the fixed effect of year when the assays were conducted, ζN(0,σ2) and ηN(0,σ2) the random effects of line and block identity, respectively. ϵN(0,σ2) defines the residual error.

The G-matrix is half the line identity (co)variance matrix (ζ), as we have measured homozygous diploid inbred lines and assume codominance. As estimated here, the broad-sense G-matrix should be an adequate surrogate for the narrow-sense G-matrix. This is because there is no inbreeding depression for self-fertility in high salt due to the self-fertilization of hermaphrodites from the experimental outbred populations (Chelo et al., 2019), and because, at least in low salt, we failed to detect average (genome-wide) directional dominance or epistasis when comparing the means of transition rates in the outbred populations with those among the inbred lines (Mallard et al., 2022).

Models were fit with the R package MCMCglmm (Hadfield, 2010). We used improper flat priors (nu=0). Model convergence was verified by visual inspection of the posterior distributions and an autocorrelation below 0.05. 100,000 burn-in iterations were done with a thinning interval of 2000 and over 2 million MCMC iterations. The A6140 G-matrix in high salt was estimated using different prior distributions, chosen among the literature as the most representative including parameter-expanded priors (see Figure 2—figure supplement 1).

Because the variance estimates resulting from the MCMCglmm models are positive definite, null expectations for the G-matrices were obtained by randomizing 1000 times the phenotypic data set. Randomization was done by shuffling inbred line and block identities and refitting the model at each iteration (Equation 2). We then computed the posterior mode for each of the 1000 models to construct a null distribution of genetic variances.

Eigendecomposition of each G-matrix was done in R as above for phenotypic (co)variances. We define the main canonical dimension of genetic variation gmax as the first eigenvector of the A6140 G-matrix (with λgmax its eigenvalue). We calculate the angle between the two gmax in high and low salt as the mean of the estimated posterior distribution modes (this angle is defined in the next section, see Equation 3).

We used the random skewers method described by Aguirre et al., 2014; Hine et al., 2009 to describe the genetic divergence during experimental evolution. In this method, random vectors are projected through the four G-matrices to estimate the genetic variance in all phenotypic directions (Equation 5, see below). Using the G-matrix posterior distributions, we tested for significant differences in genetic variance between matrices for each random vector. The vectors that showed a significant difference (i.e. no overlap between the 95% CI of the two matrices projected variance) were retained to construct an R-matrix with the (co)variances of differentiation. The eigendecomposition of the R-matrix then describes the canonical traits of genetic differentiation among the G-matrices. The first eigenvector the R-matrix is here called the vector of genetic divergence (emax, Table 1) because the A6140 ancestral population drives most differentiation. An alternative G-matrix differentiation analysis can be done with the eigentensor approach (Aguirre et al., 2014). Eigentensor analysis of the four A6140, GA[1,2,4]50 G-matrices in high salt gave similar results (see methods and results in the GitHub appendix).

Selection differentials

Request a detailed protocol

For the ancestral population A6140, we also computed the Gqw-matrices as defined in Stinchcombe et al., 2014, which is the G-matrix of the 6 traits of locomotion behavior and body size expanded to include self-fertility. The last column-vector entries of the Gqw-matrix are thus the covariances between traits and relative fitness, the selection differentials (sk).

Different individuals in separate assays were measured for self-fertility and transition rates/body size. To assess for a statistical bias on selection differential estimates when using self-fertility BLUP estimates (Hadfield et al., 2010), we generated 500 ancestral Gqw-matrices in high salt with within-line self-fertility variability across the replicated measurements of transition rates. For each transition rate measurement (one per Petri dish, see above), one inbred line self-fertility value was sampled from a normal distribution using the line’s mean and, as standard deviation, the standard error of the mean multiplied by 2. In each line mean self-fertility was calculated from multiple individuals (at least 22 and up to 85) and on average there are two transition rate values per line. Our protocol thus mimics a random split of self-fertility into two groups of identical size. The Gqw-matrix is stable to within-line self-fertility variation and subsequent analysis was done with the initial Gqw-matrix estimates. We also ensured that the G-matrix contained in the Gqw-matrix is similar to the one computed above for the ancestral population.

Phenotypic and genetic alignments

Request a detailed protocol

We used the metrics introduced by Noble et al., 2019 to compare the alignment of ancestral standing genetic variation with the first canonical dimension of phenotypic plasticity (δp), or with the first canonical dimensions of adaptive phenotypic (dmax) or genetic (emax) divergence. The first metric is the angle between two vectors. The angle between the i-th eigenvector of the A6140 G-matrix, gi, and δp is defined as:

(3) Θ=180πcos1(δpgiδpgi).

As both gi and -gi are eigenvectors of the G-matrix, Θ values between 90° and 180° were transformed so that Θ always remains between 0° and 90° (Θ=180°-Θ, results from using -gi instead of gi in Equation 3). Angles comparing the alignment of the ancestral gi with the axis of phenotypic and genetic divergence were calculated, by replacing δp in Equation 3 with dmax and emax, respectively.

For each angle, we sampled the posterior distribution of the A6140 G-matrix to create a credible interval. δp and dmax were obtained as the first eigenvectors of the SSCP matrices from the MANOVA model, as described above. The null expectation for Θ is calculated as the angle between 1000 pairs of random vectors sampled from a uniform distribution U7(1,1).

The second metric computes the proportion of ancestral genetic variance along the main canonical trait of ancestral phenotypic plasticity:

(4) r=δpTGδpδp2

where pmax is replaced by dmax when computing the proportion of ancestral genetic variance in the main canonical trait of phenotypic divergence in high salt, or by emax when relative to the main canonical trait of genetic divergence in high salt.

Π is the ratio between the amount of genetic variance in r that maximizes plasticity, phenotypic or genetic divergence, over the maximum possible amount of genetic variance in any phenotypic dimension (λgmax, the first eigenvalue of the G-matrix):

(5) Π=rλgmax

Π values are comprised between 0 (no genetic variance along the plasticity/divergence canonical traits) and 1 (when the plasticity/divergence canonical traits contain all the genetic variance in gmax). The null distributions for Π were obtained by randomizing 1,000 ancestral G-matrices through shuffling inbred line and assay block identities.

Selection differentials and gradients

Request a detailed protocol

Selection differentials were estimated above with the ancestral matrix Gqw in high or low salt as the genetic covariance between transition rates and body size with self-fertility in high salt (sk). Comparing observed and expected responses to selection was done by estimating directional selection gradients using Lande’s retrospective equation, equation 9 in Lande, 1979. This is unlike Stinchcombe et al., 2014 or Hajduk et al., 2020, where phenotypic selection gradients were obtained by regression of fitness onto the traits, following Lande and Arnold, 1983. In our case, genetic selection gradients on each transition rate were defined as:

(6) βg=G-1sk

and the phenotypic selection gradients as:

(7) β=G-1Δqk

The G-matrix of the ancestral population was assumed constant during experimental evolution. The credible intervals of both selection gradients were estimated by sampling the posterior distribution of the G-matrix, assuming fixed high salt sk in Equation 6 or fixed Δq¯k for each replicate population in equation 7. We have also obtained credible intervals for βg by sampling the G-matrix and the posterior distribution of sk.

Whether selection theory predicts the direction of phenotypic evolution amounts to an alignment between expected and observed phenotypic divergence. We thus calculated the angle (as above, Θ) between the selection differentials on transition rates and body size in high or low salt (sk), with the observed phenotypic divergence in high or low salt (Δq¯k). The null expectations for the angle were obtained by calculating the angles between 1000 pairs of random vectors sampled from a uniform distribution U7(1,1). Similarly, whether selection theory predicts the magnitude of phenotypic evolution in high salt can be calculated as the ratio of observed phenotypic divergence over selection differentials (Δq¯k/sk). We sampled the posterior distribution of G-matrix for these comparisons to obtain credible intervals.

Contrasts between posterior distributions

Request a detailed protocol

The ‘significance’ of the posterior mode estimates are based on its overlap with the posterior null distribution of the posterior modes (Walter et al., 2018). For all comparisons of posterior distributions, significance can be inferred when their 83% credible intervals do not overlap (Austin and Hux, 2002), assuming homoscedasticity.

Archiving

Request a detailed protocol

Self-fertility data has been previously published by Chelo et al., 2019, and locomotion behavior data in low salt for the ancestral population in Mallard et al., 2022. New data (adaptation, locomotion behavior and body size in high salt), R code, and modeling results are in our GitHub repository and will be archived in a public repository upon publication.

Appendix 1

Appendix 1—key resources table
Reagent type (species) or resourceDesignationSource or referenceIdentifiersAdditional information
Strain, strain background (C. elegans, male and hermaphro dite)A6140DOI: 10.1186/s12915-014-0093-1A6140Ancestor, outbred population
Strain, strain background (C. elegans, hermaphro dite)A6140L#DOI: 10.1534/ genetics.117.300406A6140L#A6140 inbred lines
Strain, strain background (C. elegans, male and hermaphro dite)GA150DOI: 10.1186/ s12915-014-0093-1GA150Outbred population
Strain, strain background (C. elegans, hermaphro dite)GA150L#DOI: 10.1534/genetics.117.300406GA150L#GA150 inbred lines
Strain, strain background (C. elegans, male and hermaphro dite)GA250DOI: 10.1186/s12915-014-0093-1GA250Derived from A6140
Strain, strain background (C. elegans, hermaphro dite)GA250L#DOI: 10.1534/genetics.117.300 406GA250L#GA250 inbred lines
Strain, strain background (C. elegans, male and hermaphro dite)GA450DOI: 10.1186/ s12915-014-0093-1GA450Outbred population
Strain, strain background (C. elegans, hermaphro dite)GA450L#DOI: 10.1534/genetics.117.300406GA450L#GA250 inbred lines
Strain, strain background (C. elegans, hermaphro dite)EEV1402DOI : 10.1038/ncomms3417EEV wormbase lab line 1402A6140 inbred line with GFP transgene ccIs4251
Software, algorithmMTWDOI: 10.1038/nmeth.1625--
Software, algorithmRhttp://www.Rproject.org-version 3.3.2
Software, algorithmRStanhttp://mc-stan.org/-R package version 2.18.2
Software, algorithmstatshttps://www.Rproject.org/-R package version 3.3.2
Software, algorithmlme4doi: 10.18637/jss.v067.i01-R package version 1.1-32
Software, algorithmemmeansdoi:10.1080/00031305.1980.10483031-R package version 1.7.1-1
Software, algorithmmultcomp10.1002/bimj.200810425-R package version 1.4-23
Software, algorithmmsmDOI: 10.18637/jss.v038.i08.-R package version 1.7
Software, algorithmMCMCgl mmDOI: 10.18637/jss.v033.i02.-R package version 2.34
Software, algorithmR scriptshttps://github.com/ExpEvolWormLab/Mallard_Robertson-This paper

Data availability

New data, R code for analysis and modeling results is freely accessible and can be found at https://github.com/ExpEvolWormLab/Mallard_Robertson (copy archived at Mallard and Teotonio, 2023).

The following previously published data sets were used

References

  1. Software
    1. Lenth RV
    (2021)
    emmeans: estimated marginal means, Aka least-squares means
    R Package Version 1.7.1-1.
    1. Merilä J
    2. Sheldon BC
    3. Kruuk LE
    (2001)
    Explaining stasis: microevolutionary studies in natural populations
    Genetica 112–113:199–222.
    1. Phillips P
    2. McGuigan K
    (2006)
    Evolutionary Genetics: Concepts and Case Studies
    Evolution of genetic variance-covariance structure, Evolutionary Genetics: Concepts and Case Studies, Oxford University Press.
  2. Software
    1. R Development Core Team
    (2018) R: A language and environment for statistical computing
    R Foundation for Statistical Computing, Vienna, Austria.
  3. Book
    1. Robertson A
    (1968)
    The Spectrum of Genetic Variation
    Syracuse University Press.
  4. Book
    1. Simpson GG
    (1944)
    Tempo and Mode in Evolution
    New York: Columbia University Press.
  5. Software
    1. Stan Development Team
    (2018)
    Rstan: the R interface to Stan
    R Package Version 2.18.2.
  6. Book
    1. Stiernagle T
    (1999)
    Maintenance of C. elegans
    Oxford: Oxford University Press.
    1. Teotónio H
    2. Manoel D
    3. Phillips PC
    (2006)
    Genetic variation for outcrossing among Caenorhabditis elegans isolates
    Evolution; International Journal of Organic Evolution 60:1300–1305.
    1. Teotónio H
    2. Rose MR
    3. Proulx S
    (2009)
    Phenotypic Plasticity of Insects
    Phenotypic plasticity and Evolvability: an empirical test with experimental evolution, Phenotypic Plasticity of Insects, Science Publishers, Inc, 10.1201/b10201-22.
  7. Book
    1. Wagner G
    (2001)
    The Character Concept in Evolutionary Biology
    Academic Press.
  8. Book
    1. Walsh B
    2. Lynch M
    (2018)
    Evolution and Selection of Quantitative Traits
    Sinauer Associates, Inc.

Decision letter

  1. Jacqueline Sztepanacz
    Reviewing Editor; University of Toronto, Canada
  2. Molly Przeworski
    Senior Editor; Columbia University, United States
  3. Greg Walter
    Reviewer; Monash University, Australia

Our editorial process produces two outputs: (i) public reviews designed to be posted alongside the preprint for the benefit of readers; (ii) feedback on the manuscript for the authors, including requests for revisions, shown below. We also include an acceptance summary that explains what the editors found interesting or important about the work.

Decision letter after peer review:

Thank you for submitting your article "Selection and the direction of phenotypic evolution" for consideration by eLife. Your article has been reviewed by 3 peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by Molly Przeworski as the Senior Editor. The following individual involved in the review of your submission has agreed to reveal their identity: Greg Walter (Reviewer #3).

The reviewers have discussed their reviews with one another, and the Reviewing Editor has drafted this to help you prepare a revised submission.

Essential revisions:

1. Include body size as a trait in the analyses (these data seem to have already been collected, but not included).

2. Describe and justify what the movement traits represent biologically and why they would be adaptive in a high-salt environment.

3. Justify/explain why the movement traits can be considered independent traits and are not simply multiple measurements of the same phenotype.

4. Provide evidence that the fitness of these populations increased throughout this experiment.

5. Provide evidence and clarification of the method used to determine statistical support for genetic variance in G (showing that G has more than 1 significant genetic dimension would help provide support for #3).

6. Justify why heritability is an appropriate scale of measurement, rather than raw genetic variances or mean-scaled genetic variance.

7. Revise the introduction to provide targeted background on the key questions addressed.

8. Re-write the methods and results to improve clarity.

Reviewer #1 (Recommendations for the authors):

Line 63: I think you mean phenotypically correlated, not genetically correlated.

Line 68: the equation shown does not explicitly consider non-linear selection.

Line 90: under the infinitesimal model G is not predicted to change and the framework should generally work well across generations.

Line 141: the described methods do not match what is said in the rest of the paper. These methods say that the evolved populations were evolved in 8mM of salt for 35 generations and then kept at 305mM of salt for 15 generations. The rest of the paper implies that the concentration of salt gradually increases throughout the selection experiment. Which is it?

Line 160: line-mean fitness values are estimated with error. Are these errors incorporated into the models?

Line 235: how informative was the prior that was used, and how sensitive are the estimates to changing the prior? It seems that there are not enough genetic degrees of freedom to estimate all of the parameters in the model from the data.

Line 245: how did you estimate the null distributions? This is not explained.

Line 255: it's not obvious that the null expectation is 45. Could you show this empirically using bootstrapping to generate the null distribution.

Line 257: 1/6 of the total genetic variance is not the appropriate "null" distribution. Due to sampling error alone, the genetic variance will decline exponentially across the eigenvectors of G (see McGuigan and Blows 2015; Sztepanacz and Blows 2017).

Line 295: This does not seem like the correct comparison because of the potential for founder effects when establishing GA[1,2,4].

Tables and Figures: it looks like heritability and not genetic variances are reported. This is an important distinction

Reviewer #2 (Recommendations for the authors):

In addition to the previous comments of support for the overall question addressed, and the breadth of data available, but concerns over the evidence of adaptive evolution and multivariate nature of the traits, I had several further requests for clarification or comments about the interpretation of evolutionary genetic concepts.

Overall, I found the logical arguments presented in the introduction hard to follow, and what the study aimed to address was a surprise to me when I arrived at the end.

– Consider a general restructuring of the information to make it clearer from the start what the major focus of the study is.

– A lot of ground is covered very quickly, leaving the reader to fill in some big gaps from their own knowledge – consider whether all topics are sufficiently important to introduce up front, or if you can focus on the key ones to set the stage.

– Adding to the challenge of following the logic of the study motivation is that each individual sentence presents multiple, interconnected ideas. Some simple editorial changes that limited each sentence to one or two ideas I think would really help.

More specifically:

– Line 45: why does plasticity need to align with axes of genetic variance for adaptation to occur? The orientation of G and selection still matters – which you seem to also swing back to in the final sentence. It's not clear to me what this entire paragraph adds to the argument already made other than that phenotypic plasticity may alter the selection a population experiences when the environment changes.

– Line 53: I disagree that an adaptive argument is the most plausible or parsimonious here. The simpler explanation is that the G and environment are channeled through the same developmental pathways, and thus generate similar variation (i.e., Cheverud's argument for P=G due to alignment of G and E).

– Line 58: I don't understand how you arrive at the conclusion that when plasticity isn't adaptive it means we've misunderstood selection. Why is a presumption that plasticity must be adaptive warranted?

– Line 62: This is a misunderstanding of selection and evolution. Genetic correlations have no relevance for estimating selection, which acts solely on the phenotypic variation, irrespective of the causes of that variation. Genetic correlations will impact response to selection (i.e., evolution), but not selection itself.

– Why does it matter if evolution is due to selection acting directly on a trait or due to the genetic correlation of that trait to fitness (or another trait contributing to phenotypic differences in fitness)? How does whether the selection is direct or indirect have any bearing on whether plastic responses are adaptive? The genetic variance in plastic responses? Whether G is aligned with the direction of selection? How G evolves?

Some further requests for clarification of information:

– Equation 1 (and subsequent models): what is "CG"?

– Why are plasticity and divergence modeled for each trait individually? G is estimated from a multivariate model, so why are plasticity and divergence compiled as a vector of individual estimates?

– Equation 2 and 3: what is "Div".

– Equation 6: why is there a single intercept for the 6 traits? Why would traits not have their own individual intercept? Centering the data prior to analysis will not preclude traits differing in intercept when the fixed effects are taken into account.

– Line 243: why is e11 defined as the dimension with the most divergence?

– Line 248: "strictly" not "strickly".

– Line 251: Please explain what Gqw is – the 13 trait G?

– Equation 8: what is this "G" / where does it come from? How much of a bias is introduced here by taking the inverse of a G where many dimensions have very low variance (only positive due to imposed constraints)?

– Line 256: given the size of your experiment, what is the expected variation (error) around 45 degrees for two unrelated vectors?

– Second line after Equation 10 – add "genetic" for clarity: "…maximum amount of genetic variance …".

– Table 2 please also provide the error estimates.

– Figure 2: Could you please provide further information on the approach employed by the cited Morrissey and Bonnet 2019 for establishing null expectations? I might have missed this in the methods, but the evidence that you actually have genetic variance is key to all conclusions in the paper, so being very clear about this evidence is important.

– Bottom panel in B -I presume the vertical line indicates 0, but the use of red here and for high salt is confusing.

– Line 281: It's not clear what you mean by "modular" – the definition that I am familiar with relates to the strength of correlation among one set of traits relative to their correlation with other traits, but does not depend on a shared direction of correlation (i.e., a module contains both positively and negatively associated traits, so long as those correlations are relatively strong). The observation that one (maybe 2) axes capture all variation suggests a single module (i.e. a single behavioral syndrome has been measured).

– Line 284: what is "rounder" and how do you determine it's not "important"?

– Figure 3: Where does the null expectation come from? Please report the actual numbers for observed and null (including CI – in text or table) as it is not possible to see the overlap on this small plot on the plot. The conclusion that phenotypic plasticity is aligned with one G and not the other is very strong given that it does not seem that you can actually tell any difference between the two estimates (beyond sampling error alone).

– Tables 3 and 4. I have no idea what is being shown here – please provide sufficient information to relate this back to the Methods and the models that were fit, and what null hypothesis the reported Χ2 is associated with.

– Line 328: what's the logic for concluding that genetic variance in fitness indicates a stressful environment? Perhaps such conclusions are better placed in the Discussion where they might be justified via further information.

– Line 368: where does the information on gene expression come from? How is gene expression restricted to active/still? Do you mean that there is only divergence in expression between these states? How does this explain why leaving the stationary state has a different sign of divergence from entering the stationary state or changing direction? These observations seem consistent with my earlier interpretation that there is a single behavioral trait being assessed here, not six.

– Line 377: how statistically robust are the estimates of mutational variance? Where there is no variance, zero covariance will be implicated. The strength of this evidence also speaks to the question of whether there is a single "trait" being assessed here.

– Line 383: will only facilitate adaptation if they are aligned with future directions of selection.

– Line 450: why do you expect that changes in locomotor behavior will be under direct selection here? If the environment was heterogeneous for salinity (or food), then there may be fitness benefits, but how does moving any particular way allow the worms to increase their survival or fecundity in a high-salinity environment? Again, how do these results reflect a response to selection versus neutral evolution?

– Line 480: more details on where these data on size, and these estimates are coming from would be appreciated.

Reviewer #3 (Recommendations for the authors):

While I think these data are very valuable and their results are likely robust, I found that some parts of the manuscript were difficult to follow. In particular, it would help if the methods and results could be described a bit more clearly. In my major points below I highlight the areas I struggled to follow, and provide some suggestions for how to better focus the conceptual framework.

I think the authors need to better refine the conceptual framework and clarify the hypotheses in the final paragraph of the introduction. My confusion is because I'm not sure if they are focussing on adaptive phenotypic responses to the novel salt environment (i.e. plastic and evolved movement towards an optimum phenotype), or to predict the direction of phenotypic evolution. If the former, then I think testing whether plasticity in the novel environment is non-adaptive or adaptive is important, and testing whether phenotypic evolution has occurred in the direction of gmax or phenotypic selection (i.e. the phenotypic optimum) is also important. However the focus seems to be the latter, which means that there are currently no hypotheses justifying the test of the amount of genetic variance in the direction of plasticity, and furthermore, direct vs indirect selection is not explicitly compared. Below I provide some suggestions on how the conceptual framework could be clarified, I hope these comments help.

1) The framing in the introduction could be improved. In particular, the first paragraph jumps from selection on multiple traits to mutation-selection balance and alignment with the selection surface. I found some of the sentences quite long and it took several reads to understand their meaning. Furthermore, I found that plasticity is not well-grounded in the conceptual framework throughout the introduction and is not closely linked to the focus of the manuscript (predicting phenotypic evolution). L.46 assumes that plasticity is adaptive in a novel environment, which is often not the case. If the authors want to test the role of plastic responses in persisting and then adapting to the novel environment, I think they need some way of quantifying adaptive vs non-adaptive plasticity as well as testing whether adaptation occurs in the direction of plasticity (I found some information in supplementary material, but the link with the main idea of predicting phenotypic evolution is not clear). If it is possible to estimate a phenotypic selection gradient, they could test whether plasticity is adaptive by how well it aligns with phenotypic selection. However, this is a slightly different topic to predicting phenotypic evolution, which I think is the main focus of the manuscript. Furthermore, the second paragraph ends with 'indicating that selection is often misunderstood', which is a little vague and does not connect plasticity to phenotypic evolution.

2) Adaptation is important for this study, but evidence that the evolved populations adapted to the high salt environment is not presented. On lines 114 and 411 there is a reference to another study, but I think the manuscript would benefit from a more detailed explanation (or some discussion) about adaptation to the stressful environment, especially if the proxy for fitness (fertility) did not show evidence of adaptation (L.411).

3) There is missing information in the methods.

a. What are the traits (transition rates)? What do they represent and how are they important for the salt environment? There is some information L.180-193, but there is no biological explanation of what was measured or why these traits were chosen. It is also stated elsewhere that these traits are independent, could the authors please clarify how they are independent given they seem to use some of the same information in their calculation? I have trouble understanding how (for example) SF is different from FS in Figure 1.

b. Is the data collected from the same experiment that is a reciprocal transplant of all populations in all treatments? The section describing fitness (L.155) makes it sound like they were from different experiments. Furthermore, the use of BLUPs as estimates of fitness (from another experiment) is worrying, as explained by Hadfield et al. (2010; https://doi.org/10.1086/648604). If this is the case, it would be better to estimate the additive genetic covariance between traits and fitness (rather than the BLUPs). This is stated later in the manuscript, but I'm confused about where estimates of fitness came from and how they were used. I apologise if I've misunderstood the methods.

c. L.246 the authors describe a null distribution, but how was this constructed? Morrissey et al. (2019; https://doi.org/10.1111/evo.13842) made an important suggestion for more conservative estimates of null G (also described in Hangartner et al. 2020; https://doi.org/10.1111/evo.13891).

4) The description of the statistical analyses is sometimes difficult to follow. There are (by necessity) many parameters estimated and some more justification throughout the methods would help.

a. Some clarity would help in the description of equations 1-3 as it is difficult to understand what the motivation is for each equation, or what they represent. Also, equations 4-5 could be easier to understand by including (in text) the definition of a plasticity vector from Noble et al. (2019): δ X = Xnn – Xnov, where X is the mean of each trait in the nonnovel and novel environments.

b. L.257 and Equation10-12: I think equation 10 represents the amount of genetic variance in the direction of plasticity, but as a proportion of total plasticity (i.e. the length of the plasticity vector), is this correct? If so, please clarify in the text. A minor point on L. 257: this equation represents the amount of genetic variance in the direction of plasticity, which is different to genetic variance in plasticity. I am also confused by equations 10-11, why not just calculate the proportion of genetic variance in the direction of maximum evolvability (i.e. denominator in equation 10 is the 1st eigenvalue of G) – with plasticity vector normalised to unit length so that it represents the direction of plasticity. This removes the need for equation 11 and gives you the same information: the amount of genetics in the direction of plasticity as a proportion of the direction of maximum genetic variance. I apologise again if I've misunderstood, but I think a more detailed description and justification for equations 10-11 would help.

c. Equation 12: I don't think this is the correct null expectation. Why would we expect plasticity to be in the direction of 'average' genetic variance? I would think a better null would be random vectors (of unit length) projected through G so that you would test whether plasticity is in a direction that describes greater genetic variance than expected by random sampling.

d. Equation 5: Is this calculated for each of the derived populations? This is important for understanding the results in Figure 5. Because they are independent populations (with G also estimated separately), to me it would make more sense to do pairwise comparisons for each derived population with the ancestral (which could be summarised in Figure 5).

e. Figure 6 and 7 (+ their interpretation): To test how to predict phenotypic evolution, I think it would be better to directly compare the distribution of β and the selection differentials. By comparing them separately it is not clear what hypothesis is being tested and the argument is verbal rather than quantitative – see Hajduk et al. (2020; https://doi.org/10.1098/rstb.2019.0359) for a nice example.

5) In the discussion, I found it very interesting that the authors found body size to be both genetically correlated to the movement traits, and that selection was predicted accurately (both sg and β). I am curious as to why this trait wasn't included in the analyses because, as the authors highlight, it is probably the trait selection is operating on (or at least provides an estimate of performance as outlined by the traditional Lande and Arnold selection analyses). It made me wonder how body size changed across treatments and whether it evolved in the high salt treatment. I would think that it would be important to include body size in the analysis.

6) The discussion could be a bit more concise. I also think alternative explanations need to be discussed as I'm not sure I agree with the interpretation on L.425. It is more parsimonious (or at least a viable alternative) that during adaptation genetic variance has been depleted as would be expected if the selection is strong. Another alternative would be that selfing in a stressful environment has reduced the amount of genetic variance. I think it could be worth including these alternative explanations.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Selection and the direction of phenotypic evolution" for further consideration by eLife. Your revised article has been evaluated by Molly Przeworski (Senior Editor) and a Reviewing Editor.

Two reviewers and I have reviewed the substantial changes made to the manuscript since its previous submission. We are all in agreement that the current version adequately addressed previous concerns and is largely improved. However, the manuscript would benefit from some additional minor revisions in writing to help improve the clarity of this technically and biologically complex paper. I do not expect the changes should take the authors very long to complete.

There are several suggestions from the reviewers for ways to improve clarity. Addressing the premise of the study in a clearer and more biologically motivated way, and clarifying the methods and results for testing indirect versus direct selection are particularly important.

Reviewer #2 (Recommendations for the authors):

The responses from the Authors are comprehensive, and I believe have adequately addressed concerns about analyses and interpretation, as well as improving clarity and accessibility of the paper. In particular, the Authors have clarified the approach taken to defining a null distribution against which observed parameter estimates are compared. The shuffling of data and estimation of G from multiple shuffled samples characterized random associations in the data. The Authors have also included further empirical data that provides evidence that the three experimental populations had adapted to high salt.

Throughout the manuscript there is a clear commitment to transparency of data and analyses, with access to data and additional results to support the main results reported.

Reviewer #3 (Recommendations for the authors):

The authors present an interesting test of whether experimental evolution can be predicted from the patterns of genetic variation in the ancestral population. The authors have done an impressive job to address the comments raised in the previous review, which means that the paper is much stronger and focused on the questions addressed. I appreciated their detailed responses to the questions raised in the previous review and I found the new version greatly improved.

I only have one comment: The writing, while comprehensive, is quite dense and often abstract throughout the paper. This will make it difficult for a broader audience to follow. In addition, because of the (necessary) statistical rigor, it felt like the biology is missing, especially in the setup of the study and the results. For example, terms such as 'canonical traits' and references to alignments in the first paragraph make it difficult for a more broader audience to understand the background of the study, or the gap in knowledge that is being addressed.

I provide some specific examples below, but suggest that revising should focus on clarity throughout (and on the biological rather than statistical importance):

– L29-33 it seems early in the introduction to introduce G and Lande's equation, and it does so with little biological foundation.

– L.13 a better way of saying 'follow their selection gradient' would be 'whether phenotypic evolution occurs in the direction of selection'.

– L.14 'the canonical trait of the multivariate phenotype' this is not broadly accessible. Perhaps 'trait combinations with large amounts of genetic variation' (or something similar) would be more intuitive.

The premise of the study is much clearer but could be clarified further. The paragraph in the introduction L.115-116 needs to be simplified as it is difficult to follow and just lists the contrasts in a complicated way. Instead, it would be better to include the tests and hypotheses, rather than 'characterising' adaptation. The final two sentences are confusing (L.123-127), they should more clearly outline the focus of the study. In particular, I'm not sure what the final sentence is saying: are you testing whether genetic selection gradients in the ancestral population = phenotypic selection gradients after adaptation? I would have thought you would compare genetic selection gradients with phenotypic divergence between the ancestral and adapted populations.

I liked the addition of the explanation of the traits in the introduction, but is it possible to add some more biology? L.111-112 '… which ones [traits] are genetically or environmentally independent' is a bit vague. Do you instead have an idea of how the movement traits help them to navigate low vs high salt environments? Or some other more biological reasoning. It is suggested 'movement can increase during experimental evolution due to more foraging and dwelling' – but this is confusing, is it that there is selection for greater movement across generations? Or just that there is greater movement in a new environment? Foraging and dwelling seem like opposites unless dwelling is defined as a specific behaviour.

I found the major tests of indirect vs direct selection and also how phenotypic evolution is predicted very difficult to follow in the results. I think the analyses are correct, but the results need to be described more simply so that it is easier to understand. I'm not sure the section on direct selection is required, and if removed, could help simplify the study to focusing on predicting phenotypic evolution. Or there needs to be a better justification for including direct selection because the Stinchombe (2014) and Hajduk et al. (2020) framework does not seem to have been used (but I apologise if I've misinterpreted something). Specifically, is there a reference for equation 7? Why does this Β represent phenotypic selection when it seems to capture phenotypic divergence. I am also a little confused as to why selection differentials are not just compared to observed divergence. And if you want to include direct vs indirect, predicted evolutionary change with indirect selection could be calculated using δ Z = G*Β where Β is calculated as per equation 6 – this uses the Stinchcombe (2014) framework. Sorry again if I've misinterpreted something, but there seems to be a disconnect between the results as written, the figures and the interpretation seems contradictory in the abstract (L.15-16).

Overall I found the study very interesting, I hope my comments help.

https://doi.org/10.7554/eLife.80993.sa1

Author response

Essential revisions:

1. Include body size as a trait in the analyses (these data seem to have already been collected, but not included).

We have included body size in all the analyses as the seventh trait (the other six being the transition rates between movement states). We obtain our estimates separately for each of the two environments (the low salt domestication environment and the high salt novel environment). To estimate selection differentials in the ancestral population, we estimate a Gmatrix with 8 traits (Gqw-matrix), including transition rates, body size and self-fertility. As suggested by the reviewers, we now include multivariate analyses of phenotypic variation (MANOVA). We keep, however, univariate phenotypic analysis as they allow us to estimate mean values for each inbred line (for visualization purposes) and also test the significance of phenotypic divergence for each of the replicate evolved populations. Multivariate and univariate approaches give similar results however.

2. Describe and justify what the movement traits represent biologically and why they would be adaptive in a high-salt environment.

We have one paragraph in the Introduction that describes and justifies why transition rates and body size might be relevant.

3. Justify/explain why the movement traits can be considered independent traits and are not simply multiple measurements of the same phenotype.

This is also justified in the Introduction (and Methods, equation 2). In the Results section we show that genetically, there are 3 canonical traits for locomotion behavior (eigenvectors that explain a significant amount of ancestral genetic variance). We now also provide a summary Table 3, which relates the main canonical traits of standing genetic variation with those of ancestral phenotypic plasticity, of phenotypic and genetic divergence and of mutational covariance that we recently published (Mallard et al. G3, 2023). Finally, in the Discussion, one paragraph justifies why more than one trait can be independent given standing variation, mutation and selection.

4. Provide evidence that the fitness of these populations increased throughout this experiment.

We did new experiments to show adaptation to the high salt environment. The results confirm adaptation after 50 generations of evolution, for all replicate populations (Figure 5).

5. Provide evidence and clarification of the method used to determine statistical support for genetic variance in G (showing that G has more than 1 significant genetic dimension would help provide support for #3).

Please see the Methods section: we repeatedly shuffled the phenotypic dataset by line and block identity to obtain a null posterior distribution of the genetic variance modes at each iteration. Whenever this approach was used we label it in orange (figures), so hopefully it will be clearer for the reader. We believe that our sampling procedure addresses the problem you worked on in Sztepanacz and Blows (Genetics, 2017). The first reason is that to obtain uncertainties we compared the G-matrix with the distribution of posterior modes of random Gmatrices. The second reason is that we are also asking the question of whether genetic variances in the observed eigentraits of the ancestral population are different from null expectations, and thus conditional on the how the traits are genetically structured. For this, we projected the random G-matrices onto the observed eigenvectors (Figure 2, supplement Figure 3). Both approaches return similar results for the expected null distribution. Please let us know if you think we are wrong so that we can correct it. Also note that we do sample the posterior of the empirical G-matrix when estimating the selection gradients (Figure 9).

6. Justify why heritability is an appropriate scale of measurement, rather than raw genetic variances or mean-scaled genetic variance.

There was a misunderstanding. We did not present heritability estimates nor we do it now. We only present estimates of genetic (co)variances throughout. As suggested by reviewer #1 we now assess different prior distributions when estimating the G-matrix (Figure 2, figure supplement 1). The inclusion of body size data forced us to also consider scaling issues: we have opted for multiplying body size data by 50 in order to be on a similar phenotypic scale as the transition rates in movement states.

7. Revise the introduction to provide targeted background on the key questions addressed.

We have completely revised the Introduction, hopefully it will read much better. We are nonetheless somewhat terse when considering short-term versus long-term evolutionary dynamics, though we would like to keep it that way because we think it is important to understand how theory assumptions, and existing comparative evidence, are limited.

8. Re-write the methods and results to improve clarity.

Done. We have also included several subsection headings to improve clarity. Please see reply to the reviewers where we detail new methods, analyses and results.

Reviewer #1 (Recommendations for the authors):

9. Line 63: I think you mean phenotypically correlated, not genetically correlated.

Thank you. Corrected.

10. Line 68: the equation shown does not explicitly consider non-linear selection.

Thank you. Corrected.

11. Line 90: under the infinitesimal model G is not predicted to change and the framework should generally work well across generations.

Indeed. Corrected.

12. Line 141: the described methods do not match what is said in the rest of the paper. These methods say that the evolved populations were evolved in 8mM of salt for 35 generations and then kept at 305mM of salt for 15 generations. The rest of the paper implies that the concentration of salt gradually increases throughout the selection experiment. Which is it?

It is the first regime. We have throughout tried to make it explicit.

13. Line 160: line-mean fitness values are estimated with error. Are these errors incorporated into the models?

This is indeed a concern. The inbred self-fertility (fitness) are estimated with error but it was not incorporated in any model. We now provide an additional analysis to test whether there is a consistent bias in the genetic covariance estimates between traits and fitness. We generated for each line a fertility value randomly sampled from a normal distribution centered on the observed line mean and presuming that the initial set of individuals used to estimate fitness was randomly split in two groups (multiplying by two the expected variance) as we have two technical replicates per line for phenotyping. We show that the mean estimates of selection differentials are robust to within line fertility variation (Figure 4, supplement figure 1).

14. Line 235: how informative was the prior that was used, and how sensitive are the estimates to changing the prior? It seems that there are not enough genetic degrees of freedom to estimate all of the parameters in the model from the data.

We provide more details on the prior used to compute the G-matrices (flat improper prior). We further show the ancestral G-matrix estimates using different priors, including parameter expanded priors (Figure 2, supplement figure 1). Overall, ancestral G-matrix estimates are robust to varying prior distributions, particularly the covariance estimates. For the variance estimates we find that when using the Inverse-Wishart distribution as a prior with a high “degree of belief”, only body size has a lower value than when using other priors. It is unclear to us if this is much of an issue for subsequent analysis. We do, however, conclude that are enough degrees of freedom to model G-matrices (see also replies regarding how uncertainties are found).

15. Line 245: how did you estimate the null distributions? This is not explained.

The null distributions involving the randomization of line and block IDs, reestimation of the G-matrix, and eigen decomposition where appropriate, are shown with orange labels. In the methods we explain the approach in the Methods and then in the figure captions when necessary.

16. Line 255: it's not obvious that the null expectation is 45. Could you show this empirically using bootstrapping to generate the null distribution.

We had trouble understanding the null of this angle as well and really appreciate the reviewer's input as the null expectation is not 45°. It will only be 45° in two dimensions and progressively approaches 90° as the number of dimensions increases. We now generate a null distribution using bootstrapping as you suggest. We failed to find an appropriate reference for this problem, or the ones we found were mathematically challenging. Please see Methods.

17. Line 257: 1/6 of the total genetic variance is not the appropriate "null" distribution. Due to sampling error alone, the genetic variance will decline exponentially across the eigenvectors of G (see McGuigan and Blows 2015; Sztepanacz and Blows 2017).

We agree that the genetic variance will decline exponentially even with randomly generated trait values (though it will occur in random directions). We now provide a null which is based on the randomized G-matrices obtained by shuffling line and block IDs. In Figure 2, supplement figure 3 we also present a null conditional on the observed eigentraits of the ancestral population. The observed values are lower than both kinds of nulls which strengthen our conclusion that 3 traits are genetically orthogonal in the ancestral population regardless of the salt environment. Please see also reply to point #5.

18. Line 295: This does not seem like the correct comparison because of the potential for founder effects when establishing GA[1,2,4].

There were very limited founder effects as the GA replicates were split equally at sample sizes of at least the effective population size in low salt, from the same unique ancestor population (A6140). If anything there was reduced genetic variance in the ancestor, before deriving the replicates, as at the time of phenotyping it had spent more time under cryogenesis. We do not wish to detail potential problems with our protocols as their relevance for experimental evolution can be found in our previous work (see for example, Guzella et al. PlosGenetics 2018 for the consequences of drift and bottlenecks on adaptation to high salt).

19. Tables and Figures: it looks like heritability and not genetic variances are reported. This is an important distinction

No heritability estimates were presented, only genetic variances.

Reviewer #2 (Recommendations for the authors):

I have several requests for clarification or comments about the interpretation of evolutionary genetic concepts.

Overall, I found the logical arguments presented in the introduction hard to follow, and what the study aimed to address was a surprise to me when I arrived at the end.

20. Consider a general restructuring of the information to make it clearer from the start what the major focus of the study is.

21. A lot of ground is covered very quickly, leaving the reader to fill in some big gaps from their own knowledge – consider whether all topics are sufficiently important to introduce up front, or if you can focus on the key ones to set the stage.

22. Adding to the challenge of following the logic of the study motivation is that each individual sentence presents multiple, interconnected ideas. Some simple editorial changes that limited each sentence to one or two ideas I think would really help.

We revised the whole manuscript, and particularly the text in the Introduction. We have removed the paragraph on phenotypic plasticity (and now only interpret the phenotypic divergence and adaptation in light of ancestral phenotypic plasticity in the Discussion). The focus of the study is on the predictability of multivariate trait evolution, and thus on Lande’s equation when using selection differentials to estimate the directional selection gradients.

More specifically:

23. Line 45: why does plasticity need to align with axes of genetic variance for adaptation to occur? The orientation of G and selection still matters – which you seem to also swing back to in the final sentence. It's not clear to me what this entire paragraph adds to the argument already made other than that phenotypic plasticity may alter the selection a population experiences when the environment changes.

We’ve eliminated the introductory paragraph on phenotypic plasticity.

24. Line 53: I disagree that an adaptive argument is the most plausible or parsimonious here. The simpler explanation is that the G and environment are channeled through the same developmental pathways, and thus generate similar variation (i.e., Cheverud's argument for P=G due to alignment of G and E).

We agree, but no longer refer to Cheverud’s original idea as we do not wish to discuss standing environmental stochastic variation, nor its evolution. We have little power to do so, and, more importantly, our phenotyping design and modeling not allow us to separate the effects of random environmental covariantes (error) from those of within-genotype stochastic variance. In our populations, locomotion behavior appears to be phenotypically and genetically similarly aligned (not shown). This question also relates to how many genetically independent traits exist, in the sense that adaptive phenotypic evolution could also have been limited by g2 or g3 but again we lack power to test their influence. The result that the selection gradient does not appear to align with neither standing genetic variation or mutational variance (Table 3) further suggests that there are fundamental developmental or physiological constraints. We now discuss this in the context of phenotypic plasticity and the evolution of novelty during adaptation, though without explicitly referring to Cheverud’s idea.

25. Line 58: I don't understand how you arrive at the conclusion that when plasticity isn't adaptive it means we've misunderstood selection. Why is a presumption that plasticity must be adaptive warranted?

The sentence was eliminated. Please see Discussion. Ancestral phenotypic plasticity is “non-adaptive” and only reveals the topography of the adaptive landscape.

26. Line 62: This is a misunderstanding of selection and evolution. Genetic correlations have no relevance for estimating selection, which acts solely on the phenotypic variation, irrespective of the causes of that variation. Genetic correlations will impact response to selection (i.e., evolution), but not selection itself.

Rephrased.

27. Why does it matter if evolution is due to selection acting directly on a trait or due to the genetic correlation of that trait to fitness (or another trait contributing to phenotypic differences in fitness)? How does whether the selection is direct or indirect have any bearing on whether plastic responses are adaptive? The genetic variance in plastic responses? Whether G is aligned with the direction of selection? How G evolves?

Correct. We’ve eliminated introducing plasticity upfront. We consider that a trait is adaptive, regardless of the environment where it was measured, whenever mean trait changes are correlated with adaptation to the target environment (high salt). We do not estimate ancestral genetic variance for plasticity (as it is collinear with variance in one of the environments, assuming linear reaction norms) nor do we explicitly test for the evolution of plasticity. See above replies and Discussion.

Some further requests for clarification of information:

28. Equation 1 (and subsequent models): what is "CG"?

It was referring to “Common Garden” with fixed levels correcting for differences between the years of phenotyping locomotion behavior. We simplified the presentation of the model (Equation 2).

29. Why are plasticity and divergence modeled for each trait individually? G is estimated from a multivariate model, so why are plasticity and divergence compiled as a vector of individual estimates?

We model plasticity and phenotypic divergence in a single model using a multivariate model. See reply to point #1.

30. Equation 2 and 3: what is "Div".

Deleted.

31. Equation 6: why is there a single intercept for the 6 traits? Why would traits not have their own individual intercept? Centering the data prior to analysis will not preclude traits differing in intercept when the fixed effects are taken into account.

The phenotypic data are not centered and each trait has its own intercept (µ is a column vector). We center and standardize the design and environmental covariates.

32. Line 243: why is e11 defined as the dimension with the most divergence?

We have replaced the eigentensor analysis with a random skewers approach, as it simplifies the interpretation of “genetic divergence”. Please see Methods, and Figure 7, panel C. Briefly, both the random skewers approach and the previous eigentensor approach (now presented only in our GitHub repository) compare the four G-matrices and describe differentiation among them. Eigen decomposition of the (co)variances matrices for differentiation reveals that there is only one canonical trait explaining most variation which is due to differences of the evolved replicate populations to the ancestor population. For this reason, we call this canonical trait the dimension of most divergence (and not differentiation).

33. Line 248: "strictly" not "strickly".

Corrected.

34. Line 251: Please explain what Gqw is – the 13 trait G?

Following reviewer #3 suggestion, we estimate Gqw using the 6 transition rates and body size measured in high salt or low salt, together with the high salt fertility values. There are therefore 8 traits for each Gqw.

35. Equation 8: what is this "G" / where does it come from? How much of a bias is introduced here by taking the inverse of a G where many dimensions have very low variance (only positive due to imposed constraints)?

Thanks for this comment. All the trait variances are significantly above the null expectations (the randomized G matrices; Figure 2, supplement figure 2), but clearly two traits have very low variance (FS, BS). To address if taking the inverse of these variances leads to a bias in the selection gradients estimates, we repeated the analysis but without including those two traits (red in Author response image 1). Furthermore, we also did the analysis per trait, ignoring all other traits (green in Author tesponse image 1). There is no apparent bias as the original estimates with all traits included (gray) are similar when including or not the two low variance traits. The univariate approach (ignoring covariances between traits) obviously results in similar estimates as the selection differentials (compare Author response image 1 with Figure 4 in main text). We agree that the selection gradient for FS and BS are likely to be biased as a large proportion of the observed genetic variance is also detected in the randomized G-matrices. Yet, in the multivariate case, the selection gradients are not different from zero and their low values seem to be associated with an increased CI. This analysis in Figure 9, supplement figure 1.

Author response image 1
Selection gradients estimated with all 7 traits (gray, as shown in Figure 9 of the manuscript), a subset of 5 traits (red) or separately (green) using ββgg = GG-1sk.

CI are all obtained from sampling the posterior of the ancestral G-matrix and using fixed selection differentials.

36. Line 256: given the size of your experiment, what is the expected variation (error) around 45 degrees for two unrelated vectors?

We replaced this null expectation as mentioned in the response to reviewer #1 (reply #23). We produced random expectations based on the sampling of pairs of vectors from a uniform distribution.

37. Second line after Equation 10 – add "genetic" for clarity: "…maximum amount of genetic variance …".

Thank you.

38. Table 2 please also provide the error estimates.

Table 2 has been replaced by several tables with the results of the MANOVA and univariate analysis. Errors for each replicate population response could only be obtained from univariate models (used just for plotting in Figure 6). The comparison of each replicate with the ancestral population is shown in the Figures 1 and 6 as well as in supplementary tables. For the MANOVA analysis, each trait is contrasted between replicate populations and the ancestral population (Figures 1 and 6, lines; and supplementary tables). The major factor results are presented in Table 2 (the full results as a supplementary table).

39. Figure 2: Could you please provide further information on the approach employed by the cited Morrissey and Bonnet 2019 for establishing null expectations? I might have missed this in the methods, but the evidence that you actually have genetic variance is key to all conclusions in the paper, so being very clear about this evidence is important.

See replies #5, #12, and #18.

40. Bottom panel in B -I presume the vertical line indicates 0, but the use of red here and for high salt is confusing.

The vertical line is in black in all plots now.

41. Line 281: It's not clear what you mean by "modular" – the definition that I am familiar with relates to the strength of correlation among one set of traits relative to their correlation with other traits, but does not depend on a shared direction of correlation (i.e., a module contains both positively and negatively associated traits, so long as those correlations are relatively strong). The observation that one (maybe 2) axes capture all variation suggests a single module (i.e. a single behavioral syndrome has been measured).

We agree that the use of the term modular was irrelevant. We eliminated this discussion altogether.

42. Line 284: what is "rounder" and how do you determine it's not "important"?

We have rephrased the sentence and eliminated these terms.

43. Figure 3: Where does the null expectation come from? Please report the actual numbers for observed and null (including CI – in text or table) as it is not possible to see the overlap on this small plot on the plot. The conclusion that phenotypic plasticity is aligned with one G and not the other is very strong given that it does not seem that you can actually tell any difference between the two estimates (beyond sampling error alone).

The canonical trait of ancestral phenotypic plasticity is derived from the MANOVA analysis (pmax). Projections of the G matrix on pmax, and the angle between gmax, g2 and g3 with pmax are presented only for the high salt environment (Figure 3). The null for the projections comes from randomizing the G matrix, for the angles from sampling pairs of random vectors. We have added this explanation and the CI values in the figure caption (and also as a supplementary table). Similarly, we have provided all relevant information in Figure 8.

44. Tables 3 and 4. I have no idea what is being shown here – please provide sufficient information to relate this back to the Methods and the models that were fit, and what null hypothesis the reported Χ2 is associated with.

Most of the phenotypic analyses have been modified. We hope that the results are more clearly presented. Please note that all data, tables and scripts for analysis can be found by following the links provided in the figure legends. The analysis and results should be fully reproducible by the reader.

45. Line 328: what's the logic for concluding that genetic variance in fitness indicates a stressful environment? Perhaps such conclusions are better placed in the Discussion where they might be justified via further information.

We no longer use the term “stressful”.

46. Line 368: where does the information on gene expression come from? How is gene expression restricted to active/still? Do you mean that there is only divergence in expression between these states? How does this explain why leaving the stationary state has a different sign of divergence from entering the stationary state or changing direction? These observations seem consistent with my earlier interpretation that there is a single behavioral trait being assessed here, not six.

We have eliminated most of this paragraph.

47. Line 377: how statistically robust are the estimates of mutational variance? Where there is no variance, zero covariance will be implicated. The strength of this evidence also speaks to the question of whether there is a single "trait" being assessed here.

We no longer explicitly refer to pleiotropy and instead only mention that the first two traits with most mutational variance are not the same as the one under directional selection, or as the one with most ancestral standing genetic variation. Please see Table 3. Our estimates of mutational variance are statistically robust (DOI: 10.1093/g3journal/jkac335).

48. Line 383: will only facilitate adaptation if they are aligned with future directions of selection.

Indeed, rephrased.

49. Line 450: why do you expect that changes in locomotor behavior will be under direct selection here? If the environment was heterogeneous for salinity (or food), then there may be fitness benefits, but how does moving any particular way allow the worms to increase their survival or fecundity in a high-salinity environment? Again, how do these results reflect a response to selection versus neutral evolution?

Mechanistically, we don’t know why locomotion behavior would change in high salt. We briefly introduce the rationale to study locomotion, but one of the goals of our study is precisely to find how many traits are environmentally or genetically independent. What we show is that changes in locomotion behavior are adaptive in high salt and result from a response to indirect or direct selection. Further investigation will be necessary to find the “biological” relevance of these mathematically defined traits, in particular by mapping QTLs for the several canonical traits we describe. See also replies #4 and #27, for the adaptive nature of phenotypic evolution.

50. Line 480: more details on where these data on size, and these estimates are coming from would be appreciated.

We now include data on body size in all analyses.

Reviewer #3 (Recommendations for the authors):

51. While I think these data are very valuable and their results are likely robust, I found that some parts of the manuscript were difficult to follow. In particular, it would help if the methods and results could be described a bit more clearly. In my major points below I highlight the areas I struggled to follow, and provide some suggestions for how to better focus the conceptual framework.

I think the authors need to better refine the conceptual framework and clarify the hypotheses in the final paragraph of the introduction. My confusion is because I'm not sure if they are focussing on adaptive phenotypic responses to the novel salt environment (i.e. plastic and evolved movement towards an optimum phenotype), or to predict the direction of phenotypic evolution. If the former, then I think testing whether plasticity in the novel environment is non-adaptive or adaptive is important, and testing whether phenotypic evolution has occurred in the direction of gmax or phenotypic selection (i.e. the phenotypic optimum) is also important. However the focus seems to be the latter, which means that there are currently no hypotheses justifying the test of the amount of genetic variance in the direction of plasticity, and furthermore, direct vs indirect selection is not explicitly compared. Below I provide some suggestions on how the conceptual framework could be clarified, I hope these comments help.

Thank you for the time and effort you took in reviewing our manuscript. We believe to have addressed all your concerns and have followed your suggestions. Please see the new Introduction for an explanation of our focus of study (predicting phenotypic evolution). We do spend 2/3 of the results text, however, describing the ancestral states and evolution.

52. The framing in the introduction could be improved. In particular, the first paragraph jumps from selection on multiple traits to mutation-selection balance and alignment with the selection surface. I found some of the sentences quite long and it took several reads to understand their meaning. Furthermore, I found that plasticity is not well-grounded in the conceptual framework throughout the introduction and is not closely linked to the focus of the manuscript (predicting phenotypic evolution). L.46 assumes that plasticity is adaptive in a novel environment, which is often not the case. If the authors want to test the role of plastic responses in persisting and then adapting to the novel environment, I think they need some way of quantifying adaptive vs non-adaptive plasticity as well as testing whether adaptation occurs in the direction of plasticity (I found some information in supplementary material, but the link with the main idea of predicting phenotypic evolution is not clear). If it is possible to estimate a phenotypic selection gradient, they could test whether plasticity is adaptive by how well it aligns with phenotypic selection. However, this is a slightly different topic to predicting phenotypic evolution, which I think is the main focus of the manuscript. Furthermore, the second paragraph ends with 'indicating that selection is often misunderstood', which is a little vague and does not connect plasticity to phenotypic evolution.

We have re-written the Introduction. We first introduce Lande’s equation to predict phenotypic evolution, then the problem of indirect selection to estimate selection gradients. This is followed by a (necessarily) brief paragraph on why Lande’s equation might not be appropriate to predict phenotypic evolution (changing selection gradients and unstable Gmatrix over several generations) and finally a summary of how we present the experiment, phenotyping, and analyses. We have eliminated the paragraph on phenotypic plasticity and only discuss it at the end of the manuscript.

53. Adaptation is important for this study, but evidence that the evolved populations adapted to the high salt environment is not presented. On lines 114 and 411 there is a reference to another study, but I think the manuscript would benefit from a more detailed explanation (or some discussion) about adaptation to the stressful environment, especially if the proxy for fitness (fertility) did not show evidence of adaptation (L.411).

We performed an additional experiment to address this question. All four populations (ancestral and derived) were pairwise competed with a tester GFP-marked strain in the high salt environment, for one generation. Adaptation is measured by how much the experimental wild-type allele changes in this competition. We find adaptation to high salt (Figure 5).

54. There is missing information in the methods.

a. What are the traits (transition rates)? What do they represent and how are they important for the salt environment? There is some information L.180-193, but there is no biological explanation of what was measured or why these traits were chosen. It is also stated elsewhere that these traits are independent, could the authors please clarify how they are independent given they seem to use some of the same information in their calculation? I have trouble understanding how (for example) SF is different from FS in Figure 1.

We have expanded the methods section to better explain how the transition rates were estimated. We have also added a paragraph to the Introduction addressing the potential biological significance of transition rates in the two salt environments. The transition rates are the probabilities that an individual will change its current state of movement given its state at an earlier time step. Transition rates were independently estimated (as opposed to the self-transition rates which are constrained to be a linear combination of two other transition rates, equation 1). The environmental mean distribution of SF and FS appears similar and they both increase with salt concentration (Figure 1). Genetically, however, they are negatively correlated with each other (Figure 2) and respond to selection in opposite directions (Figure 6).

b. Is the data collected from the same experiment that is a reciprocal transplant of all populations in all treatments?

We have tried to clarify our design in the Methods. Inbred lines from all four populations were assayed at the same time after experimental evolution, using frozen samples. Each “block” corresponds to a different sample thaw and phenotyping. Other environmental covariates were modeled, and importantly grandmaternal and maternal environmental effects were the same in each block. Each block contained several lines, usually from several populations, which were phenotyped in both low salt and high salt. The raw data table and sample size should also make our design clearer (presented as supplementary tables).

55. The section describing fitness (L.155) makes it sound like they were from different experiments. Furthermore, the use of BLUPs as estimates of fitness (from another experiment) is worrying, as explained by Hadfield et al. (2010; https://doi.org/10.1086/648604). If this is the case, it would be better to estimate the additive genetic covariance between traits and fitness (rather than the BLUPs). This is stated later in the manuscript, but I'm confused about where estimates of fitness came from and how they were used. I apologise if I've misunderstood the methods.

Yes, the fertility estimates were obtained in different assays. This is why we have used BLUPs, in order to account for environmental covariates, different time of data collection, different experimenters, etc. As you mention there might be some problems with using BLUPs, though the bias might be to underestimate the mean differences between lines. The estimates of the genetic covariance between traits and fitness are presented in Figure 4, from the Gqw matrix (all 6 transition rates, body size and fertility). These estimates, however, do not include assessing within-line variation in fertility. To address this problem we provide an additional analysis where we vary within-line fertility, to mimic the assay sampling. This is presented in Figure 4, figure supplement 1 (see also reply #20 above). We do not observe much bias.

c. L.246 the authors describe a null distribution, but how was this constructed? Morrissey et al. (2019; https://doi.org/10.1111/evo.13842) made an important suggestion for more conservative estimates of null G (also described in Hangartner et al. 2020; https://doi.org/10.1111/evo.13891).

Randomization of G, see reply #28. We referred to Morrissey et al. when performing the eigentensor analysis, which is now presented as an appendix in our GitHub repository. We did follow Morrissey et al. suggestion.

56. The description of the statistical analyses is sometimes difficult to follow. There are (by necessity) many parameters estimated and some more justification throughout the methods would help.

a. Some clarity would help in the description of equations 1-3 as it is difficult to understand what the motivation is for each equation, or what they represent. Also, equations 4-5 could be easier to understand by including (in text) the definition of a plasticity vector from Noble et al. (2019): δ X = Xnn – Xnov, where X is the mean of each trait in the nonnovel and novel environments.

We are sorry for this. We have modified the phenotypic analysis and the different variables are now, hopefully, better described. We have also included a notation table (Table 1).

b. L.257 and Equation10-12: I think equation 10 represents the amount of genetic variance in the direction of plasticity, but as a proportion of total plasticity (i.e. the length of the plasticity vector), is this correct? If so, please clarify in the text.

It does not represent the genetic variance in the direction of plasticity as a proportion of total plasticity. The equation simply accounts for the fact that the plasticity vector might not be of length one. We have revised this whole section.

57. I am also confused by equations 10-11, why not just calculate the proportion of genetic variance in the direction of maximum evolvability (i.e. denominator in equation 10 is the 1st eigenvalue of G) – with plasticity vector normalised to unit length so that it represents the direction of plasticity. This removes the need for equation 11 and gives you the same information: the amount of genetics in the direction of plasticity as a proportion of the direction of maximum genetic variance. I apologise again if I've misunderstood, but I think a more detailed description and justification for equations 10-11 would help.

Thank you for this comment. We used the framework defined by Noble et al. PNAS 2019 for simplicity but we realized that more explanation was needed. It is not necessary to normalize by the largest eigenvalues as we only compare the empirical values to the null expectation. However, we think that the reader will better understand the metric if it scaled between 0 and 1.

c. Equation 12: I don't think this is the correct null expectation. Why would we expect plasticity to be in the direction of 'average' genetic variance? I would think a better null would be random vectors (of unit length) projected through G so that you would test whether plasticity is in a direction that describes greater genetic variance than expected by random sampling.

As you suggest we have eliminated equation 12, and now only present one null distribution (obtained from randomizing the G-matrix). Note that the projections are of the Gmatrix through the plasticity (or divergence) vectors, and not the other way around.

d. Equation 5: Is this calculated for each of the derived populations? This is important for understanding the results in Figure 5. Because they are independent populations (with G also estimated separately), to me it would make more sense to do pairwise comparisons for each derived population with the ancestral (which could be summarised in Figure 5).

Indeed, these are now what we call δ_q (Table 1). We estimate the difference between each evolved population relative to the ancestral population, using the MANOVA. The results are presented in supplementary data table 2 in Figure 6 (and illustrated with asterisks). We use δ_q, for each evolved replicate population, to estimate the genetic selection gradients (Figure 9).

e. Figure 6 and 7 (+ their interpretation): To test how to predict phenotypic evolution, I think it would be better to directly compare the distribution of β and the selection differentials. By comparing them separately it is not clear what hypothesis is being tested and the argument is verbal rather than quantitative – see Hajduk et al. (2020; https://doi.org/10.1098/rstb.2019.0359) for a nice example.

As suggested, we present a comparison between genetic and phenotypic selection gradients on each trait (Figure 9, panel A); obtained with Lande’s equation and assuming a stable G-matrix over multiple generations (CI are obtained by sampling the G-matrix posterior distribution, see replies above). Note that the estimates of phenotypic selection gradients have larger means and CI than the genetic selection gradients, leading to high replicate heterogeneity. For this reason we would like to continue to present the tests on the direction and magnitude of phenotypic evolution (Figure 9, panels B and C). All tests are quantitative even if not formally defined.

58. In the discussion, I found it very interesting that the authors found body size to be both genetically correlated to the movement traits, and that selection was predicted accurately (both sg and β). I am curious as to why this trait wasn't included in the analyses because, as the authors highlight, it is probably the trait selection is operating on (or at least provides an estimate of performance as outlined by the traditional Lande and Arnold selection analyses). It made me wonder how body size changed across treatments and whether it evolved in the high salt treatment. I would think that it would be important to include body size in the analysis.

We now include body size from the start of the analysis, with additional insights about the evolution of phenotypic plasticity and novelty. We have deleted the discussion about direct versus indirect selection.

59. The discussion could be a bit more concise. I also think alternative explanations need to be discussed as I'm not sure I agree with the interpretation on L.425. It is more parsimonious (or at least a viable alternative) that during adaptation genetic variance has been depleted as would be expected if the selection is strong. Another alternative would be that selfing in a stressful environment has reduced the amount of genetic variance. I think it could be worth including these alternative explanations.

We have tried to reduce and simplify the Discussion. We now include a paragraph on the loss of genetic variance by selection (and eliminated the discussion on overdominance). We also mention the expectations for the loss of genetic variance due to drift under selfing assuming infinitesimal trait inheritance.

[Editors' note: further revisions were suggested prior to acceptance, as described below.]

Reviewer #3 (Recommendations for the authors):

The authors present an interesting test of whether experimental evolution can be predicted from the patterns of genetic variation in the ancestral population. The authors have done an impressive job to address the comments raised in the previous review, which means that the paper is much stronger and focused on the questions addressed. I appreciated their detailed responses to the questions raised in the previous review and I found the new version greatly improved.

I only have one comment: The writing, while comprehensive, is quite dense and often abstract throughout the paper. This will make it difficult for a broader audience to follow. In addition, because of the (necessary) statistical rigor, it felt like the biology is missing, especially in the setup of the study and the results. For example, terms such as 'canonical traits' and references to alignments in the first paragraph make it difficult for a more broader audience to understand the background of the study, or the gap in knowledge that is being addressed.

We want to thank the reviewer for the constructive remarks. We tried in this last version of the manuscript to address all of your comments and suggestions. As surely you agree, explaining every technically in vague or unprecise terms is unfeasible. We further provide numerous references to reviews of the several methods and topics that we address.

I provide some specific examples below, but suggest that revising should focus on clarity throughout (and on the biological rather than statistical importance):

– L29-33 it seems early in the introduction to introduce G and Lande's equation, and it does so with little biological foundation.

We have added two sentences with biological motivation before introducing Lande’s equation in a separate paragraph (line 25).

– L.13 a better way of saying 'follow their selection gradient' would be 'whether phenotypic evolution occurs in the direction of selection'.

Thank you, corrected.

– L.14 'the canonical trait of the multivariate phenotype' this is not broadly accessible. Perhaps 'trait combinations with large amounts of genetic variation' (or something similar) would be more intuitive.

We have re-phrased the sentence and now are explicitly defining what we mean by canonical traits, following your suggestion (line 39).

The premise of the study is much clearer but could be clarified further. The paragraph in the introduction L.115-116 needs to be simplified as it is difficult to follow and just lists the contrasts in a complicated way. Instead, it would be better to include the tests and hypotheses, rather than 'characterising' adaptation. The final two sentences are confusing (L.123-127), they should more clearly outline the focus of the study.

Thank you. We have greatly reduced this paragraph to just include the hypotheses being tested.

In particular, I'm not sure what the final sentence is saying: are you testing whether genetic selection gradients in the ancestral population = phenotypic selection gradients after adaptation? I would have thought you would compare genetic selection gradients with phenotypic divergence between the ancestral and adapted populations.

Please see also the reply below. Briefly, and following your suggestion in the first round of review, we used Hadjuk et al. framework to compare the genetic selection gradient with the phenotypic selection gradient, using in both circumstances Lande’s retrospective equation (which allows us to include G-matrix uncertainties). As mentioned in the reply below, this is different (but complementary) from the comparison of the selection differentials with the observed phenotypic divergence (which was presented in the first version of the manuscript, and continues to be so in Figures 9BC). Hajduk et al. are not explicit in testing for responses to selection when comparing phenotypic with genetic selection gradients (their equation 1.3). They do follow up with using genetic selection gradients to predict phenotypic evolution using Lande’s equation δ_Z = G times Β_g, but this is different from testing for indirect versus direct selection.

I liked the addition of the explanation of the traits in the introduction, but is it possible to add some more biology? L.111-112 '… which ones [traits] are genetically or environmentally independent' is a bit vague. Do you instead have an idea of how the movement traits help them to navigate low vs high salt environments? Or some other more biological reasoning. It is suggested 'movement can increase during experimental evolution due to more foraging and dwelling' – but this is confusing, is it that there is selection for greater movement across generations? Or just that there is greater movement in a new environment? Foraging and dwelling seem like opposites unless dwelling is defined as a specific behaviour.

We are not able to directly connect the change in our locomotion traits with changes in specific behaviors such as foraging, dwelling, sleep, mate search etc.. and we do not want the readers to believe that we can. We have specifically defined a phenotypic space that is general enough to encompass these behaviors without the necessity to describe them in anthropomorphized terms. After all, we only take locomotion behavior traits as a surrogate for any multivariate trait evolution. This said, for example, foraging would be something like preferring forward movement, and dwelling a preference for rapid changes in the direction of movement. We have added a sentence in the introduction to clarify the biological significance of locomotion behavior in our two environments and refer the reader to Fravell et al. 2020 where a discussion about the complexities of behavior locomotion can be found.

I found the major tests of indirect vs direct selection and also how phenotypic evolution is predicted very difficult to follow in the results. I think the analyses are correct, but the results need to be described more simply so that it is easier to understand. I'm not sure the section on direct selection is required, and if removed, could help simplify the study to focusing on predicting phenotypic evolution.

We would like to keep the section on indirect versus direct selection, as predicting multivariate trait evolution crucially depends on being aware of it. Please see next reply.

Or there needs to be a better justification for including direct selection because the Stinchombe (2014) and Hajduk et al. (2020) framework does not seem to have been used (but I apologise if I've misinterpreted something). Specifically, is there a reference for equation 7? Why does this Β represent phenotypic selection when it seems to capture phenotypic divergence. I am also a little confused as to why selection differentials are not just compared to observed divergence. And if you want to include direct vs indirect, predicted evolutionary change with indirect selection could be calculated using δ Z = G*Β where Β is calculated as per equation 6 – this uses the Stinchcombe (2014) framework. Sorry again if I've misinterpreted something, but there seems to be a disconnect between the results as written, the figures and the interpretation seems contradictory in the abstract (L.15-16).

Overall I found the study very interesting, I hope my comments help.

Maybe we have not properly explained our analyses and results. Strictly, we have not used Stinchcombe et al. or Hajduk et al. approaches because they estimate the phenotypic selection gradients (Β) using the Lande-Arnold regression. Instead we use Lande’s retrospective equation to estimate both Β and Stinchcombe’s genetic selection gradients, Β_g. Lande’s retrospective equation (our equations 6 and 7) are equation 9 in Lande 1979, which is now referred to in the methods, line 706 – besides being presented in the introduction.

We interpreted your suggestion in the previous round of reviewing for us to apply Lande’s retrospective equation for Β because this is what is usually done in natural populations (as only divergence data are available) and your suggestion was on how to properly predict phenotypic evolution. As mentioned in the reply above, the explicit test for direct selection in Hajduk proposal is that Β be equal to Β_g, that is that the phenotypic covariance between trait and fitness (Lande-Arnold’s regression) over total phenotypic variance is equal to the genetic covariance between traits’ and fitness

(Robertson’s selection differentials) over the genetic variance for the trait. We agree that our approach appears to confound the question of direct versus indirect selection with predicting phenotypic evolution. However, the test for indirect versus direct selection (the comparison of Β and Β_g, Figure 9A), is presented separately from the test of predicting phenotypic evolution (comparison of observed divergence with selection differentials, Figures 9B and 9C). Overall, we agree that both tests appear to be confounded, here and in the literature.

We also note that our approach considers the uncertainty in the estimation of the ancestral G-matrix for both Β and Β_g (though admittedly, like R. Lande, we assume an unchanging G with evolution). In the paragraph on indirect versus direct selection, we refer to this issue (see also supplementary analysis to Figure 9). Our approach does not seem to have been reported before.

https://doi.org/10.7554/eLife.80993.sa2

Article and author information

Author details

  1. François Mallard

    Institut de Biologie de l’École Normale Supérieure, CNRS UMR 8197, Inserm U1024, PSL Research University, Paris, France
    Contribution
    Conceptualization, Data curation, Software, Formal analysis, Validation, Investigation, Writing - review and editing
    For correspondence
    mallard@bio.ens.psl.eu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2087-1914
  2. Bruno Afonso

    Institut de Biologie de l’École Normale Supérieure, CNRS UMR 8197, Inserm U1024, PSL Research University, Paris, France
    Contribution
    Data curation, Investigation
    Competing interests
    No competing interests declared
  3. Henrique Teotónio

    Institut de Biologie de l’École Normale Supérieure, CNRS UMR 8197, Inserm U1024, PSL Research University, Paris, France
    Contribution
    Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Writing - original draft, Project administration
    For correspondence
    teotonio@bio.ens.psl.eu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1057-6882

Funding

European Research Council (ERC-St-243285)

  • Henrique Teotónio

Agence Nationale pour la Recherche (ANR-14-ACHN-0032-01)

  • Henrique Teotónio

Agence Nationale pour la Recherche (ANR-17-CE02-0017-01)

  • Henrique Teotónio

National Science Foundation (PHY-1748958)

  • Henrique Teotónio

Gordon and Betty Moore Foundation (2919.02)

  • Henrique Teotónio

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank I Chelo, H Gendrot, T Guzella, and L Noble for help with worm handling, data acquisition or data analysis; C Baer, C Dillmann, T Long, L Noble, J Pemberton, P Phillips, S Proulx, J Sztepanacz and P de Villemereuil for discussion; J Sztepanacz, G Walter, and reviewers for suggestions on improving the presentation of this study. Funding: This work was supported by the European Research Council (ERC-St-243285), the Agence Nationale pour la Recherche (ANR-14-ACHN-0032–01, ANR-17-CE02-0017-01), and the KITP Quantitative Biology program (National Science Foundation, PHY-1748958; Gordon and Betty Moore Foundation, 2919.02).

Senior Editor

  1. Molly Przeworski, Columbia University, United States

Reviewing Editor

  1. Jacqueline Sztepanacz, University of Toronto, Canada

Reviewer

  1. Greg Walter, Monash University, Australia

Version history

  1. Preprint posted: May 29, 2022 (view preprint)
  2. Received: June 11, 2022
  3. Accepted: July 14, 2023
  4. Accepted Manuscript published: August 31, 2023 (version 1)
  5. Version of Record published: October 10, 2023 (version 2)

Copyright

© 2023, Mallard et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 836
    Page views
  • 183
    Downloads
  • 2
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. François Mallard
  2. Bruno Afonso
  3. Henrique Teotónio
(2023)
Selection and the direction of phenotypic evolution
eLife 12:e80993.
https://doi.org/10.7554/eLife.80993

Share this article

https://doi.org/10.7554/eLife.80993

Further reading

    1. Computational and Systems Biology
    2. Evolutionary Biology
    Roee Ben Nissan, Eliya Milshtein ... Ron Milo
    Research Article

    Synthetic autotrophy is a promising avenue to sustainable bioproduction from CO2. Here, we use iterative laboratory evolution to generate several distinct autotrophic strains. Utilising this genetic diversity, we identify that just three mutations are sufficient for Escherichia coli to grow autotrophically, when introduced alongside non-native energy (formate dehydrogenase) and carbon-fixing (RuBisCO, phosphoribulokinase, carbonic anhydrase) modules. The mutated genes are involved in glycolysis (pgi), central-carbon regulation (crp), and RNA transcription (rpoB). The pgi mutation reduces the enzyme’s activity, thereby stabilising the carbon-fixing cycle by capping a major branching flux. For the other two mutations, we observe down-regulation of several metabolic pathways and increased expression of native genes associated with the carbon-fixing module (rpiB) and the energy module (fdoGH), as well as an increased ratio of NADH/NAD+ - the cycle’s electron-donor. This study demonstrates the malleability of metabolism and its capacity to switch trophic modes using only a small number of genetic changes and could facilitate transforming other heterotrophic organisms into autotrophs.

    1. Evolutionary Biology
    2. Genetics and Genomics
    Thomas A Sasani, Aaron R Quinlan, Kelley Harris
    Research Article

    Maintaining germline genome integrity is essential and enormously complex. Although many proteins are involved in DNA replication, proofreading, and repair, mutator alleles have largely eluded detection in mammals. DNA replication and repair proteins often recognize sequence motifs or excise lesions at specific nucleotides. Thus, we might expect that the spectrum of de novo mutations – the frequencies of C>T, A>G, etc. – will differ between genomes that harbor either a mutator or wild-type allele. Previously, we used quantitative trait locus mapping to discover candidate mutator alleles in the DNA repair gene Mutyh that increased the C>A germline mutation rate in a family of inbred mice known as the BXDs (Sasani et al., 2022, Ashbrook et al., 2021). In this study we developed a new method to detect alleles associated with mutation spectrum variation and applied it to mutation data from the BXDs. We discovered an additional C>A mutator locus on chromosome 6 that overlaps Ogg1, a DNA glycosylase involved in the same base-excision repair network as Mutyh (David et al., 2007). Its effect depends on the presence of a mutator allele near Mutyh, and BXDs with mutator alleles at both loci have greater numbers of C>A mutations than those with mutator alleles at either locus alone. Our new methods for analyzing mutation spectra reveal evidence of epistasis between germline mutator alleles and may be applicable to mutation data from humans and other model organisms.