1. Evolutionary Biology
Download icon

Coupling adaptive molecular evolution to phylodynamics using fitness-dependent birth-death models

  1. David A Rasmussen  Is a corresponding author
  2. Tanja Stadler
  1. North Carolina State University, United States
  2. ETH Zürich, Switzerland
  3. Swiss Institute of Bioinformatics, Switzerland
Research Article
  • Cited 0
  • Views 675
  • Annotations
Cite this article as: eLife 2019;8:e45562 doi: 10.7554/eLife.45562

Abstract

Beneficial and deleterious mutations cause the fitness of lineages to vary across a phylogeny and thereby shape its branching structure. While standard phylogenetic models do not allow mutations to feedback and shape trees, birth-death models can account for this feedback by letting the fitness of lineages depend on their type. To date, these multi-type birth-death models have only been applied to cases where a lineage’s fitness is determined by a single character state. We extend these models to track sequence evolution at multiple sites. This approach remains computationally tractable by tracking the genotype and fitness of lineages probabilistically in an approximate manner. Although approximate, we show that we can accurately estimate the fitness of lineages and site-specific mutational fitness effects from phylogenies. We apply this approach to estimate the population-level fitness effects of mutations in Ebola and influenza virus, and compare our estimates with in vitro fitness measurements for these mutations.

https://doi.org/10.7554/eLife.45562.001

Introduction

The fitness effects of new mutations is a key determinant of a population’s evolutionary potential to adapt over time. Studies exploring the distribution of fitness effects (DFE) in a wide range of organisms have revealed that, while many mutations are neutral, a smaller but significant fraction have substantial effects on fitness (Sanjuán et al., 2004; Eyre-Walker and Keightley, 2007; Visher et al., 2016). These findings have spurred interest in molecular evolutionary models that consider how non-neutral mutations shape sequence evolution and patterns of genetic diversity. Such models range in complexity from simple models assuming that selection operates uniformly across all sites (Muse and Gaut, 1994; Goldman and Yang, 1994; Yang and Nielsen, 2008) to parameter rich models with site-specific fitness effects (Halpern and Bruno, 1998; Lartillot and Philippe, 2004; Rodrigue et al., 2010; Hilton and Bloom, 2018). While all of these models assume sequences evolve along an underlying phylogenetic tree representing their shared common ancestry, all also assume that the mutation process driving sequence evolution is independent of the other evolutionary processes giving rise to the tree. This independence assumption implies that mutations do not feedback and affect the fitness of lineages in the tree, such that lineages carrying highly beneficial mutations are just as likely to survive and produce sampled descendants as lineages riddled with deleterious mutations (Figure 1A).

Schematic overview of birth-death models.

(A) Standard phylogenetic models assume that there is an underlying process by which individuals replicate and give rise to a phylogeny. Mutations occur along the lineages of the tree, generating the sequence data observed at the tips. The mutation process is assumed to be independent of tree generating process, such that mutations do not impact the branching structure of the tree. (B) The MFBD allows us to relax this assumption, such that mutations at multiple sites feedback and shape both the tree and sequence data. (C) Under the original multi-type birth-death model we track Dn,i(t), the probability density that a lineage n at time t in state i produces the subtree descending from n and the observed tip states. We also track Ei, the probability that a lineage produces no sampled descendants and is therefore unobserved. (D) In the MFBD model we instead track Dn,k,i(t), the probability that a lineage n in state i at site k produces the subtree and the observed tip states at site k. Because the fitness of a lineage fn will depend on its genotype at all sites, we use the marginal site probabilities ω to compute the probability that a lineage has a certain genotype, such as ACT (Approximation 1). We can then marginalize over the fitness of each genotype weighted by its approximate genotype probability to compute the fitness fn of a lineage (Approximation 2). Finally, we need to know the probability En that a lineage left no other sampled descendants, which we approximate using the probability Eu that a lineage with same expected fitness u leaves no sampled descendants (Approximation 3). The schematic in A was reproduced from the original figure by Louis du Plessis (https://github.com/Taming-the-BEAST/TechnicalLectureSources/tree/master/BeastIntro2018) with permission under a Creative Commons license.

https://doi.org/10.7554/eLife.45562.002

While questionable in terms of biological realism, independence between the tree generating process and the mutation process allows for tractable statistical models. Assuming independence, the joint likelihood of a phylogenetic tree 𝒯 and the sequence data 𝒮 at the tips of the tree having evolved as observed can be factored into two distinct components:

(1) L(𝒮,𝒯|μ,θ)=L(𝒮|𝒯,μ)p(𝒯|θ).

The likelihood of the sequence data L(𝒮|𝒯,μ) conditional on the tree and the mutational parameters μ can be computed efficiently for most continuous-time Markov models of sequence evolution (Felsenstein, 1981). The probability density p(𝒯|θ) of the tree 𝒯 given the parameters generating the tree θ can likewise be computed under widely used coalescent (Griffiths and Tavaré, 1994; Pybus et al., 2000) or birth-death models (Rannala and Yang, 1996; Stadler, 2009). In Bayesian phylogenetics, p(𝒯|θ) is normally thought of as the prior distribution over trees rather than a likelihood, because the tree itself is inferred from the sequence data.

The assumption of independence between the mutation and tree generating processes may be unproblematic in certain scenarios, such as if mutations are truly neutral or do not contribute to substantial fitness differences among lineages. A common argument invoked in defense of ignoring non-neutral mutations is that macroevolutionary tree generating processes like speciation and extinction play out on longer timescales than the substitution process fixing or removing mutations within a population (Bustamante, 2005). In this case, fitness variation drives the substitution process within a population but does not ultimately drive the formation of a phylogeny at the species level. But such separation-of-timescales arguments do not hold when segregating mutations contribute to substantial fitness variation between lineages in a phylogeny, such as for rapidly evolving microbes where several different mutant strains can co-circulate. In these cases, the tree generating and mutation processes occur on the same timescale, and the fitness effects of mutations can feedback and shape the branching structure of a phylogeny (Kaplan et al., 1988; Nicolaisen and Desai, 2012; Neher and Hallatschek, 2013). Ignoring non-neutral evolution in this case may introduce biases into phylogenetic inference. But perhaps more importantly, fitness differences among lineages can be correlated with ancestral genotypes, providing information about the molecular basis of adaptive evolution we would otherwise ignore.

We therefore explore an approach that couples molecular sequence evolution to the tree-generating process using multi-type birth-death (MTBD) models. Under this approach, mutations can directly impact the fitness of a lineage in the phylogeny by altering its birth or death rate (Figure 1B). For a single evolving site or other character state, the joint likelihood of the phylogeny together with the observed tip states can be computed exactly under the MTBD model (Maddison et al., 2007; Stadler and Bonhoeffer, 2013; Kühnert et al., 2016). However, this approach is impractical for more than a few non-neutrally evolving sites due to the need to track all possible genotypes as separate types in the state space of the model. We therefore explore an approximate birth-death model that considers how mutations at multiple sites contribute to a lineage’s overall fitness, without the need to track all possible genotypes in sequence space. This approach allows us to infer the fitness effects of individual mutations and the fitness of any particular lineage at any time (based on its inferred ancestral genotype) from the branching structure of a phylogeny. Because our approach is particularly relevant to rapidly adapting microbial pathogens, we apply it to Ebola and influenza virus sequence data in order to quantify the fitness effects of naturally occurring amino acid substitutions.

Materials and methods

The MTBD at a single evolving site

Request a detailed protocol

At a single evolving site, the multi-type birth-death (MTBD) model of Stadler and Bonhoeffer (2013) can be used to compute the joint likelihood L(S,𝒯|μ,θ) of the sequence or character state data 𝒮 and phylogenetic tree 𝒯 in a way that couples the mutation process with changes in fitness along a lineage. Let Dn(t) represent the probability density (i.e. the likelihood) that the subtree descending from lineage n evolved between time t and the present exactly as observed (Figure 1C). Further, let Dn,i(t) represent this probability density conditional on lineage n being in state i out of M possible states at time t. Here the state of a lineage refers to a particular allele or character state (e.g. nucleotide or amino acid) at a single site. We reserve the term genotype to refer to a particular configuration of states across multiple sites in a sequence.

The density Dn,i(t) can be computed going backwards in time from the present (t=0) to time t along a lineage by numerically solving a system of ordinary differential equations:

(2) ddtDn,i(t)=(λi+j=1Mγi,j+di)Dn,i(t)(a)noevent+2λiEi(t)Dn,i(t)(b)birthoflineagewithnosampledescendants+j=1Mγi,jDn,j(t)(c) mutation from i to j

Here, λi is the birth rate and di is the death rate of lineages in state i, and thus reflect a lineage’s fitness. Mutations between states i and j occur at a rate γi,j, independently of birth events. Each term in (Equation 2) describes how Dn,i changes through time by accounting for all of the different events that could have occurred along the lineage. The first term (a) considers the change in probability density given that no birth, death or mutation event occurred. The second term (b) considers the probability of a birth event that went unobserved because one of the child lineages produced no sampled descendants (this event has probability Ei(t), see below). The third term (c) reflects the probability that the lineage mutated from state i to j.

Ei(t) represents the probability that a lineage in state i is not sampled and has no sampled descendants. This probability can be computed at any time t by solving a second set of ODEs:

(3) ddtEi(t)=(1si)di(a)deathwithoutsampling(λi+j=1Mγi,j+di)Ei(t)(b)noevent+λiEi(t)2(c)birth,neitherchildhassampleleddescendants+j=1Mγi,jEj(t).(d)mutationfromitoj

The first term (a) reflects the probability that a lineage dies and is not sampled, where si is the probability that a lineage in state i is sampled upon dying. Terms b-d have similar interpretations as in (Equation 2).

At a tip lineage n, we initialize Dn,i(t)=disi if the lineage was sampled upon death at time t. Alternatively, if n was sampled at the present time t=0 before dying, then Dn,i(t)=ρi, where ρi is the probability that an individual in state i was sampled at present. At a branching event, the probability density Da,i of the parent lineage a in state i giving rise to two descendent lineages n and m is updated as:

(4) Da,i=2λiDm,i(t)Dn,i(t).

The factor of two enters because either lineage m or n could have given birth and we must consider both possible events.

At the root, we can compute the probability density of the entire tree by summing over all possible root states:

(5) Dn=i=1MqiDn,i(troot)1-Ei(troot),

where qi is the prior probability that the root is in state i at time troot. Including the term 1-Ei(troot) in the denominator conditions the birth-death process on giving rise to at least one sampled individual. Dn represents the probability that the entire tree and the tip states 𝒮 evolved as exactly as observed. It is therefore equivalent to the joint likelihood L(𝒮,𝒯|μ,θ) we seek where μ={γ} and θ={λ,d,s}.

In theory, this approach could be extended to evolution at any number of sites as long as we track Dn,i(t) for all possible genotypes i. Unfortunately, this approach has limited utility because the number of possible genotypes in sequence space scales exponentially with the number of sites L (i.e. 4L possible genotypes for nucleotide sequences), making the MTBD model impractical for modeling evolution at more than a few sites.

The marginal fitness birth-death model

Request a detailed protocol

While the fitness of a lineage will generally depend on its genotype across multiple sites, tracking evolution in the space of all possible genotypes is, as just discussed, computationally infeasible. We therefore seek an approach that considers how mutations at multiple sites determine the fitness of a lineage without the need to track Dn,i for all possible genotypes. In the approach described below and outlined in Figure 1D, we therefore track molecular evolution at each site, computing the probability that each site occupies each state, and then approximate the probability of a lineage being in any particular genotype based on these site probabilities. To compute the expected fitness of a lineage, we can then sum, or marginalize, over the fitness of each genotype weighted by its approximate probability. We therefore refer to this approach as the marginal fitness birth-death (MFBD) model.

First, in order to couple a lineage’s fitness with the birth-death process, we will assume that the birth rate λn of any lineage n scales according to the fitness fg of its genotype:

(6) λn=fgλ0,

where λ0 is the base birth rate assigned to a particular reference genotype (e.g. the wildtype). A lineage’s death rate can also be coupled to its fitness, but for simplicity we will assume a lineage’s fitness is reflected only in its birth rate λn.

Let 𝒢 be the set of all possible genotypes in sequence space and gk be the state of genotype g at site k. To make it clear when we are considering evolution in genotype space rather than at a particular site, we will write the probability density Dn,i as Dn,g when i refers to a particular genotype. Furthermore, let Dn,k,i be the probability density of the subtree descending from lineage n given that site k is in state i. By definition,

(7) Dn,k,i={g𝒢:gk=i}Dn,g,

where the sum is over all genotypes in 𝒢 with site k in state i.

We can derive a difference equation for Dn,k,i from Dn,g in a straightforward manner:

(8) Dn,k,i(t+Δt)={g𝒢:gk=i}Dn,g(t+Δt)={g𝒢:gk=i}[(1(fgλ0+j=1M{g𝒢:gk=j}γg,g+d))Dn,g(t)Δt=+2fgλ0En,g(t)Dn,g(t)Δt=+j=1M{g𝒢:gk=j}γg,gDn,g(t)Δt].

Taking the limit as Δt0, we get a new system of differential equations for Dn,k,i(t):

(9) ddtDn,k,i(t)={g𝒢:gk=i}[(fgλ0+j=1M{g𝒢:gk=j}γg,g+d)Dn,g(t)+2fgλ0En,g(t)Dn,g(t)+j=1M{g𝒢:gk=j}γg,gDn,g(t)]

Unfortunately, (Equation 9) would still require us to track Dn,g(t) for all possible genotypes, precisely what we wish not to do. We show below that, if we can approximate fg and En,g for any given lineage, we can write (Equation 9) in terms of only Dn,k,i (see (Equation 19)) and therefore do not need to track each genotype.

Approximating the fitness of a lineage

Request a detailed protocol

We begin by approximating the fitness fn of a lineage n. Even if we do not know the exact genotype of a lineage at a particular time, we can compute the lineage’s expected fitness by summing over the fitness of each genotype fg weighted by the probability ωn,g that lineage n is in genotype g:

(10) 𝔼(fn)=g𝒢fgωn,g.

The same logic can be extended to compute the expected marginal fitness 𝔼(fn,k,i) of a lineage n that at site k is in state i:

(11) 𝔼(fn,k,i)={g𝒢:gk=i}fgωn,g.

Computing 𝔼(fn,k,i) using (Equation 11) requires knowledge of the genotype probabilities ωn,g, which would again require us to track evolution in genotype space. We therefore introduce our major assumption: that we can approximate genotype probabilities using only the marginal site probabilities ωn,k,i that site k is in state i. We describe how we compute ωn,k,i below. For now, we make the approximation that

(12) ω^n,g=k=1Lωn,k,gkg𝒢k=1Lωn,k,gk.

This approximation assumes that all sites evolve independently of one another, which is not generally true because mutations at different sites are linked together in genotypes with shared ancestral histories, creating correlations among sites that we ignore.

Using the approximate genotype probabilities ω^n,g, we can in turn approximate the expected marginal fitness of a lineage:

(13) f^n,k,i={g𝒢:gk=i}fgω^n,g.

If the fitness effects of each site act multiplicatively to determine the overall fitness of a lineage, we can compute f^n,k,i as:

(14) f^n,k,i=σkil=1,lkLj=1Mσljωn,l,j,

where σki is the fitness effect of site k being in state i. This formulation of f^n,k,i is useful if the number of sites L is large and the number of genotypes we need to sum over in (Equation 13) is therefore also extremely large.

Approximating the probability of no sampled descendants

Request a detailed protocol

The En,g(t) term in (Equation 9) represents the probability that a lineage n alive at time t in the past is not sampled and leaves behind no sampled descendants. En,g(t) therefore necessarily depends on the fitness of unobserved lineages descending from n and how fitness along these lineages evolves through changes in their genotype. Because it is often easier track evolution in one dimensional fitness space rather than high-dimensional sequence space (Kepler and Perelson, 1993; Tsimring et al., 1996), we simplify this problem by tracking a proxy for En,g(t) though fitness space.

Let Eu be the probability that a lineage with expected fitness u leaves no sampled descendants. While fitness can take on a continuous range of values, we track these probabilities only for a discrete set of points 𝒱 in fitness space. We can track Eu for u𝒱 by modifying (Equation 3) to obtain:

(15) ddtEu(t)=(1-su)du-(λu+v𝒱γu,v+du)Eu(t)+λuEu(t)2+v𝒱γu,vEv(t).

We can then substitute En,g(t) in (Equation 9) with Eu for the fitness value u closest to fg or f^n,k,i in fitness space.

Tracking evolution in fitness space requires us to specify rates γu,v for how lineages transition between fitness classes u and v. Let 𝒢u be the set of genotypes with expected fitness closest to u out of all fitness values in 𝒱. We approximate γu,v as:

(16) γu,v=1|𝒢u|i𝒢uj𝒢vμij,

where μij is the mutation rate between genotypes i and j. In other words, we compute the average rate of transitions out of fitness class u into v by summing over all possible transitions between genotypes contained within each fitness class. Note that if each genotype falls in a unique fitness class such that |𝒢u|=1 for all u𝒱, then Eu is computed exactly. In the Results, we compare using the approximate transition rates above to compute Eu versus an even simpler approximation where we assume no transitions between fitness classes along unobserved lineages, which has been assumed in earlier multi-type birth-death models (Rabosky et al., 2014; Barido-Sottani et al., 2018).

Computing the marginal site densities Dn,k,i

Request a detailed protocol

Recall that (Equation 9) provided an exact way to track the marginal site densities Dn,k,i based on the genotype densities Dn,g. To efficiently evaluate Dn,k,i without the need to track Dn,g for all genotypes, we apply the three approximations made above. First, we approximate the genotype probabilities ω^n,g based on the marginal site probabilities. Second, we marginalize over the fitness of each genotype (weighted by its genotype probability) to compute f^n,k,i and then substitute f^n,k,i for fg for all genotypes where gk=i below. Third, we approximate En,g by Eu for a single fitness value u closest to f^n,k,i. Making these approximations in (Equation 9) leads to:

(17) ddtDn,k,i(t)={g𝒢:gk=i}[(f^n,k,iλ0+j=1M{g𝒢:gk=j}γg,g+d)Dn,g(t)+2f^n,k,iλ0Eu(t)Dn,g(t)+j=1M{g𝒢:gk=j}γg,gDn,g(t)]

Assuming that the mutation rate from i to j at site k does not depend on the genetic background, we can substitute j=1M{g𝒢:gk=j}γg,g with j=1Mγi,j, where γi,j is the per site mutation rate. We can likewise substitute j=1M{g𝒢:gk=j}γg,gDn,g(t) with j=1Mγi,j{g𝒢:gk=j}Dn,g(t). Making these substitutions and rearranging the sums in (Equation 17), we have:

(18) ddtDn,k,i(t)=(f^n,k,iλ0+j=1Mγi,j+d){g𝒢:gk=i}Dn,g(t)+2f^n,k,iλ0Eu(t){g𝒢:gk=i}Dn,g(t)+j=1Mγi,j{g𝒢:gk=j}Dn,g

Recalling that Dn,k,i={g𝒢:gk=i}Dn,g (and by extension Dn,k,j={g𝒢:gk=j}Dn,g), we have:

(19) ddtDn,k,i(t)=(f^n,k,iλ0+j=1Mγi,j+d)Dn,k,i(t)+2f^n,k,iλ0Eu(t)Dn,k,i(t)+j=1Mγi,jDn,k,j(t).

The significance of (Equation 19) is twofold. First, we can track sequence evolution at each site individually without tracking all genotypes. Second, given f^n,k,i, we can track the overall fitness of a lineage by marginalizing over the fitness effects of all possible mutations at other sites. We can therefore track sequence evolution at each site while simultaneously taking into account the coupled fitness effects of mutations at all other sites on a lineage’s fitness.

Computing f^n,k,i still requires us to approximate the genotype probabilities using (Equation 12), which in turn requires the marginal site probabilities ωn,k,i. In our notation, ωn,k,i represents the conditional probability p(i|𝒯n,𝒮n) that lineage n is in particular state i, where 𝒯n represents the subtree descending from n with tip sequences 𝒮n represents the inverse conditional probability density p(𝒯n,𝒮n|i). We can therefore apply Bayes theorem to compute ωn,k,i given Dn,k,i:

(20) ωn,k,i=p(i|𝒯n,𝒮n)=p(𝒯n,𝒮n|i)q(i)iMp(𝒯n,𝒮n|i)q(i)=Dn,k,iq(i)iMDn,k,iq(i).

The q(i) terms represent the prior probability that the lineage is in state i. Here we make a simplification in assuming that the tree ancestral and sister to lineage n has no information regarding ωn,k,i, and thus assume a uniform prior on q(i)=1/M. The q(i) terms therefore cancel above.

Because the fitness of a lineage depends on the state of all sites, we must solve (Equation 19) for all sites simultaneously as one coupled system of differential equations. This requires updating Dn,k,i at each time step, which suggests the following iterative procedure.

At a tip n observed to be in genotype g, we initialize f^n,k,i as fg if gk=i or else f^n,k,i=0, D^n,k,i=ds or ρ, and ωn,k,i=1 if gk=i, else ωn,k,i=0. Then at each time step backwards through time from time t to time t+Δt, for each site and state we:

  1. Update Dn,k,i by numerically integrating (Equation 19) over time step Δt.

  2. Update the marginal site probabilities ωn,k,i using (Equation 20)

  3. Update the expected marginal fitness values f^n,k,i using (Equation 13) or (Equation 14).

Computing the full joint likelihood

Request a detailed protocol

We can now compute the joint likelihood of the tree and sequence data if we track Dn,k,i at each site back to the root. At the root, Dn,k,i(troot) represents p(𝒯,𝒮k|μ,θ,i), the probability density of the entire tree 𝒯 and the observed sequence data 𝒮k as site k, conditional on site k being in state i at the root. To be precise, Dn,k,i only approximates p(𝒯,𝒮k|μ,θ,i) because we computed Dn,k,i using the expected marginal fitness of a lineage f^n,k,i based on approximate genotype probabilities. We therefore introduce an additional auxiliary variable representing the entire set of expected fitness values f^n,k,i computed over all lineages, sites and states. Using this notation, Dn,k,i(troot)=p(𝒯,𝒮k|μ,θ,,i). By summing over all possible root states at site k (and conditioning on survival), we can then compute:

(21) p(𝒯,𝒮k|μ,θ,)=i=1Mq(i)p(𝒯,𝒮k|μ,θ,,i)1-Eu(troot)=i=1Mq(i)Dn,k,i(troot)1-En,k,i(troot).

Likewise, we can compute the conditional probability density p(𝒮k|𝒯,μ,θ,) of the sequence data at site k given the tree:

(22) p(𝒮k|𝒯,μ,θ,)=p(𝒯,𝒮k|μ,θ,)p(𝒯|μ,θ,).

We already know p(𝒯,𝒮k|μ,θ,) from above but now need the tree density p(𝒯|μ,θ,). This can easily be computed using a birth-death process where the birth rate of each lineage at any time t is always rescaled by its expected fitness f^n(t) contained within .

We can now compute the joint density p(𝒯,𝒮1:L|μ,θ) for all sites. Because each site is conditionally independent of all other sites given , we can factor p(𝒯,𝒮1:L|μ,θ,) into a product of densities for 𝒮k at each site and the density of the entire tree 𝒯:

(23) p(𝒯,𝒮1:L|μ,θ,)=p(𝒯|μ,θ,)k=1Lp(𝒮k|𝒯,μ,θ,).

We can thus approximate the joint likelihood of the sequence data and the phylogeny p(𝒯,𝒮1:L|μ,θ) as p(𝒯,𝒮1:L|μ,θ,). This allows us to consider how selection shapes sequence evolution at each site while simultaneously considering how the fitness effects of mutations at multiple sites act together to shape the phylogeny. As (Equation 23) makes clear though, the goodness of our approximation depends on how well the fitness values in are approximated, which in turn depends on how well we can approximate genotypes based on the marginal site probabilities. We explore the goodness of these approximations in the Results section.

Implementation

Request a detailed protocol

We first implemented the marginal fitness birth-death (MFBD) model in Matlab version R2017b. The Matlab implementation was used to test how well the MFBD model can approximate likelihoods and genotype probabilities relative to the exact multi-type birth death model tracking all possible genotypes for a simple model with only four genotypes. For statistical inference, the MFBD was implemented as an add-on package for BEAST 2 (Bouckaert et al., 2014) named Lumière, which extends the existing BDMM package for multi-type birth-death models (Kühnert et al., 2016). BEAST two is a general software platform that allows a wide range of evolutionary models including birth-death models to be fit to phylogenetic trees while jointly inferring the phylogeny using Bayesian MCMC sampling. The BEAST 2 implementation of Lumière therefore allows the joint posterior distribution of all parameters in the MFBD model and the phylogeny to be estimated from sequence data. Source code for Lumière and the Matlab implementation are freely available at https://github.com/davidrasm/Lumiere.

Simulations

Request a detailed protocol

To test the statistical performance of our approach, mock phylogenies and sequence data were simulated under a birth-death-mutation-sampling process using a variant of the Gillespie stochastic simulation algorithm (Gillespie, 2007) that recorded the ancestry of all individuals in the population. A binary sequence was associated with each lineage and allowed to mutate with a constant per-site mutation rate γ. Mutations could alter the fitness of a lineage by either increasing or decreasing its birth rate according to site-specific fitness effects. At death events, lineages were sampled with probability s, in which case they were included in the mock phylogeny. Code for these simulations is available at https://github.com/davidrasm/Lumiere/tree/master/sim (Rasmussen, 2019; copy archived at https://github.com/elifesciences-publications/Lumiere).

Ebola analysis

Request a detailed protocol

We used the Lumière implementation of the MFBD model to estimate the fitness effects of amino acid mutations previously identified to increase the infectivity of Ebola virus in human cell lines (Diehl et al., 2016; Urbanowicz et al., 2016). We reanalyzed a set of 1610 whole genome EBOV sequences sampled from Guinea, Sierra Leone and Liberia in 2014 to 2016. The sequence alignment along with the time-calibrated molecular phylogeny we used for our analysis were downloaded from https://github.com/ebov/space-time/tree/master/Data.

(Urbanowicz et al., 2016) measured the fitness effects of 17 viral genotypes carrying 18 different amino acid mutations in either single, double or triple mutant backgrounds relative to the Makona genotype first sampled at the beginning of the epidemic. Because our methods cannot estimate fitness effects of mutations at very low frequencies, we only analyzed 9 of these mutations that were present in at least 10 of the 1610 viral samples. Preliminary analysis revealed that these mutations fall within eight unique genetic backgrounds because of the way mutations are nested within other single or double mutant lineages in the phylogeny. Because the data of Urbanowicz et al. (2016) strongly suggest that epistatic interactions between mutations affect viral fitness, we estimated the genotypic fitness fg of these eight major genotypes rather than site-specific fitness effects σ. We therefore used the MFBD to track sequence evolution at each site, but used (Equation 13) to marginalize over these genotypes when approximating the fitness of a lineage.

We estimated the fitness of each genotype relative to the Makona genotype, assuming a uniform [0,2] prior distribution on these fitness values. For the other parameters in the model, we assumed a fixed death or removal rate d of 0.1667 per day based on earlier estimates (Gire et al., 2014; Stadler et al., 2014). Sampling was modeled as occurring upon removal, with the sampling proportion s set to zero before March 2014, when the first sample was collected. After March 2014, we assumed a fixed sampling proportion of 0.056, reflecting the fact that the dataset included samples from 1610 individuals out of the 28,652 probable cases reported by the WHO (WHO, 2016). Lastly, we assumed a constant amino acid mutation rate over all sites with an exponential prior on both the forward and backward mutation rate with a mean rate of 2 × 103 per site per year. We also ran a second analysis where we included the geographic locations of lineages (Guinea, Sierra Leone and Liberia) as an additional evolving character state in our model. In this analysis, we estimated the effect of geographic location on transmission rates in Sierra Leone and Liberia relative to the base transmission rate in Guinea. Both analyses can be reproduced in Lumière with the XML input files available at https://github.com/davidrasm/Lumiere/tree/master/ebola.

Influenza H3N2 analysis

Request a detailed protocol

We used the Lumière implementation of the MFBD model to estimate the fitness effects of amino acid mutations in the hemagglutinin (HA) protein of human influenza virus subtype H3N2. In order to ensure our fitness estimates were directly comparable to the mutational fitness effects previously estimated by Lee et al. (2018), we focused our analysis on viral samples in the same antigenic cluster as the A/Perth/16/2009 strain studied by Lee et al. for two reasons. First, Lee et al. (2018) showed that the fitness effects of amino acid mutations in HA vary depending on the genetic background, with greater fitness differences between more divergent strains. We therefore only considered strains with low genetic divergence from A/Perth/16/2009. Second, the deep mutational scanning experiments were performed in cell culture, and therefore do not reflect the antigenic component of viral fitness in the human population. Only considering a single antigenic cluster therefore minimizes the effect of antigenic mutations.

To further minimize additional background variation in fitness due to geography, we only considered samples collected in the United States from January 2009 to the end of 2012. Overall, we downloaded 2150 sequences from the Influenza Research Database (https://www.fludb.org/) that met these criteria. Nucleotide sequences of the HA segment were aligned in Muscle (Edgar, 2004) and a maximum likelihood phylogenetic tree was estimated in RAxML (Stamatakis, 2014) using a GTR + Gamma substitution model. To get a time-calibrated phylogeny, branch lengths in the ML tree were converted into units of real calendar time with Least Squares Dating v0.3 (To et al., 2016) using a previously estimated molecular clock rate for the HA segment of H3N2 of 5.72 x 10−3 substitutions per site per year (Rambaut et al., 2008).

In our first analysis, we estimated mutational fitness effects from the H3N2 phylogeny under the MFBD model assuming that fitness effects are multiplicative across sites, as in (Equation 14). Because of the large number of naturally occurring mutations in the HA sequences, we limited our analyses to the 17 most abundant amino acid mutations that were present in more than 10% of the sampled sequences. To compare our estimates of population-level fitness effects to fitness effects measured in vitro, we converted the relative amino acid preferences at each site from the deep mutational scanning experiments to mutational fitness effects:

(24) θk=log2πk,iπk,l,

where πk,i is the relative preference for amino acid i at site k. To compute these fitness effects, we used the averaged relative amino acid preferences reported in Dataset S3 of Lee et al. (2018).

In our second analysis, we used the relative preference data from the deep mutational scanning experiments to predict the population-level fitness of viral lineages. For this analysis, we considered all naturally occurring mutations in the HA protein that were present in at least 10 samples. In all, the fitness effects of 67 mutations distributed across 56 sites were included. To map relative amino acid preferences across multiple sites to population-level fitness, we assume that the mutational fitness effects computed from the relative amino acid preferences are additive on a log2 scale, such that the fitness fn of a lineage is:

(25) fn=(1+αkLlog2πk,iπk,l)κ.

Here, α is a linear scaling term that allows us to calibrate population-level fitness in terms of the sum of the site-specific fitness effects. We also include the scaling exponent κ to account for curvature in the fitness landscape, as might be expected to arise if mutations interact globally through synergistic (κ>1) or antagonistic (κ<1) epistatic effects across sites (Elena et al., 2010). A complete list of the HA mutations considered, their fitness effects predicted by DMS and the XML input file needed to reproduce our analysis are available at https://github.com/davidrasm/Lumiere/tree/master/influenzaH3N2.

Results

The four genotype model

We first consider a simple model of molecular evolution in order to compare the marginal fitness birth-death (MFBD) model against the exact multi-type birth-death (MTBD) model tracking all genotypes. Specifically, we consider a binary evolving sequence of length L=2 where all mutations are deleterious and carry a selective fitness cost σ. Fitness effects of individual mutations act multiplicatively, such that the double mutant has fitness (1-σ)2. With this simple model, it is therefore possible to track the evolutionary dynamics of all four genotypes (𝒢={00,01,10,11}) under both models.

Figure 2A shows a phylogeny simulated under the four genotype model, colored according to the genotype of each lineage. We computed the joint likelihood that this tree and observed tip genotypes evolved under a range of different fitness values σ for both the exact MTBD and approximate MFBD models (Figure 2B). The likelihood profiles under both models peak around the true value of σ and closely match at lower values of σ, but begin to diverge at higher values. The probability of a single hypothetical lineage being in each genotype approximated under the MFBD model is also shown against the exact genotype probabilities computed under the MTBD in (Figure 2C).

Performance of the MFBD approximation under the four genotype model.

(A) Simulated phylogeny showing the genotype of each lineage through time. (B) Joint likelihood of the phylogeny and tip genotypes under different values of σ using the the approximate MFBD (solid line) or the exact MTBD model (dotted line). The vertical blue line marks the true parameter value. (C) The normalized probability of a single hypothetical lineage being in each genotype back through time based on the MFBD approximation (solid line) versus the exact MTBD model (dotted line) with σ=0.5. Note that the probabilities for genotypes 10 and 01 are identical. All parameters besides σ were fixed at λ=0.25, d=0.05, s=0.05. The mutation rate γ was symmetric between forward and backwards mutations and fixed at 0.05.

https://doi.org/10.7554/eLife.45562.003

Because the MFBD approximates the probability of a lineage being in each genotype based on the marginal sites probabilities, we also compared how well the MFBD model approximates the genotype probability densities Dn,g relative to the exact multi-type birth-death model. Recall that Dn,g provides the probability that the subtree descending from a lineage n has evolved exactly as observed, and therefore forms the foundation of all likelihood calculations under our model. Averaged over all genotypes, the error introduced by approximating Dn,g under the MFBD model is greatest at intermediate mutation rates (Figure 3A). When there is no selection (σ=0), the MFBD introduces no error, but the error increases as the strength of selection increases (Figure 3B). We can also consider a variant of the four genotype model where each of the single mutant genotypes is neutral with σ=0 but an epistatic interaction between the two sites causes the double mutant to be deleterious with some fitness cost ϵ. Again, the error introduced by the MFBD grows as the strength of the epistatic fitness effect increases (Figure 3C).

The error introduced by approximating genotype probabilities under the MFBD model.

(A–C) The error introduced by approximating the genotype probability densities Dn,g based on the marginal sites probabilities under the MFBD model for different mutation rates (A), strengths of selection (B), and epistatic fitness effects (C). The solid line represents the MFBD approximation with fitness effects coupled across sites whereas the dashed line represents a more naive approximation that ignores the fitness effects of other sites entirely. The mean error represents the time-integrated average over all genotypes. (D–F) Normalized Dn,g probabilities for a single hypothetical lineage being in each genotype back through time based on the MFBD approximation (solid line) versus the exact MTBD model (dotted line). Each plot shows the dynamics of Dn,g for the parameter values marked by asterisks in the plots immediately above. Other parameters are fixed at λ=0.25, d=0.05, s=0.05.

https://doi.org/10.7554/eLife.45562.004

Taken together, these results suggest that the MFBD model introduces error in Dn,g by ignoring correlations among sites due to the fact that selection acts at the level of genotypes, especially when epistasis is strong. The additional correlations between sites induced by selection then causes the genotype probabilities to deviate from those expected based on the marginal site probabilities. Conversely, at very high mutation rates, correlations between sites quickly break down so that sites evolve effectively independently of one another, such that the error introduced by the MFBD also decreases as the mutation rate becomes very high.

Overall, the magnitude of the error introduced by approximating the genotype probabilities is small, especially when we can compare the MFBD model against a more naive approximation that tracks sequence evolution at each site completely independent of all other sites by setting the expected marginal fitness f^n,k,i=σ instead of using (Equation 14). This approximation completely ignores how the fitness of a lineage depends on mutations at other sites, and the error in Dn,g is generally considerably greater than under the MFBD model (Figure 3A–C; dashed lines). Moreover, even when the error introduced by the MFBD model is relatively large, the model still tracks the dynamics of Dn,g backwards through time along a lineage well (Figure 3D–F; for parameter values marked by the black asterisks in A-C).

The MFBD model also approximates En, the probability that a lineage has no sampled descendants, using a discretized fitness space and is therefore another source of potential error. Mirroring the results for Dn,g, the error introduced by this approximation peaks at intermediate mutation rates while it increases monotonically with the strength of selection and epistatic fitness effects (Figure 4A–C). Interestingly, tracking how lineages transition between fitness classes in fitness space does not improve the approximation relative to simply ignoring changes in fitness along unobserved lineages (Figure 4A–C; dashed lines). The overall magnitude of error introduced by approximating En is also small, although using a discretized fitness space does lead to some jaggedness in the dynamics of En (Figure 4D–F). However, only when selection is very strong (σ>0.8) does tracking En in fitness space result in significant errors, and then only in the more distant past (Figure 4E). In this case, a lineage’s fitness in the distant past may be a poor predictor of its probability of leaving sampled descendants at a time point in the distant future because the fitness of the lineage and its descendants may greatly change over time in a way that is difficult to predict without considering the exact mutational pathways through which a lineage can move in sequence space.

The error introduced by approximating the probability of no sampled descendants.

(A–C) The error introduced by approximating En in a discretized fitness space under the MFBD model for different mutation rates (A), strengths of selection (B), and epistatic fitness effects (C). The solid line represents the approximation where lineages are allowed to transition between fitness classes whereas the dashed line represents the assumption that fitness does not change along unobserved lineages. To obtain a single En value comparable across both models, we summed En over all genotypes weighted by the exact probability of the lineage being in each genotype and then took the time-integrated average to compute the mean error. (D–F) The dynamics of En for a single hypothetical lineage back through time based on the MFBD approximation (solid line) versus the exact MTBD model (dotted line). Each plot shows the dynamics of En for the parameter values marked by asterisks in the plots immediately above.

https://doi.org/10.7554/eLife.45562.005

Estimating site-specific fitness effects

Next, we simulated phylogenies under a model where the fitness effect of the mutant allele at each site is drawn independently from a distribution of fitness effects (DFE) in order to test how well we can estimate site-specific fitness effects. Because there can be considerable uncertainty surrounding these fitness effects, we now estimate the posterior distribution of fitness effects using Bayesian MCMC. The accuracy and precision of the estimated fitness effects varies considerably across sites, as shown for a representative phylogeny with five evolving sites in Figure 5.

Estimating site-specific fitness effects.

(A) A phylogeny simulated under a model with five evolving sites each with a random fitness effect. The lineages are colored according to the number of mutations they carry (blue = 0; yellow = 5). The distribution of fitness effects was assumed to be LogNormal with a mean of 0.85 and a standard deviation of 0.32. (B) Site-specific fitness effects estimated using the marginal fitness BD model. Red lines indicate the posterior median and 95% credible intervals. Blue lines mark the true fitness effect at each site.

https://doi.org/10.7554/eLife.45562.006

In order to better understand this variability, we simulated 100 phylogenies with randomly drawn fitness effects at either 2, 5 or 10 evolving sites. Overall, the estimated posterior median fitness effects are well correlated with their true values, although the strength of this correlation decreases as the number of sites increases (Figure 6A–C). Coverage of the 95% credible intervals on the other hand increased from 71.0 to 72.8% to 77.4%.

Inference of site-specific fitness effects from simulated phylogenies.

(A–C) Correlation between the true and estimated posterior median fitness effects for phylogenies simulated with 2, 5 or 10 evolving sites. Results are aggregated over 100 simulated phylogenies, with each point representing an estimate for a single site and phylogeny. The points are colored according to the frequency of the mutant allele among sampled individuals in the phylogeny. (D) Fitness effects estimated under the exact MTBD model tracking all four possible genotypes for the same two site simulations as in A. (E) Correlation between the site-specific fitness effects estimated under the approximate MFBD and exact MTBD for the two site simulations. (F) Error and uncertainty in estimated site-specific fitness effects across all 2, 5, and 10 site simulations. Error was calculated as the posterior median estimate minus the true fitness effect. Uncertainty was calculated as the standard deviation of the posterior values sampled via MCMC. In all simulations, sites where the Effective Sample Size of the MCMC samples was below 100 (less than 5% of all sites across simulations) were discarded. The death rate was fixed at d=0.05 but the birth, mutation and sampling rates were randomly drawn for each simulation from a prior distribution: λ Uniform(0.1,0.2); γ Exponential(0.01); s Uniform(0,1). Only the birth rate was jointly inferred with the site-specific fitness effects.

https://doi.org/10.7554/eLife.45562.007

While there is no systematic directional bias, fitness effects are underestimated for sites at which the mutant allele is at low frequency among sampled individuals and overestimated for sites where the mutant allele is at high frequencies. This however appears to be an intrinsic feature of estimating fitness effects from the branching structure of a phylogeny, as the same phenomena is observed under the exact MTBD model with two sites and four genotypes (Figure 6D), and the estimates made under the approximate MFBD model are highly correlated with estimates made under the exact MTBD model (Figure 6E).

Across all sites and simulations, accuracy decreased when the mutant allele at a given site was at low or high frequencies, and there was considerably more uncertainty for sites where the mutant allele was at very low frequencies (Figure 6F). Thus, while the MFBD model generally performs well at estimating site-specific fitness effects, the accuracy and precision of these estimates varies greatly depending on the frequency of a given mutation in a phylogeny.

Ebola virus adaptation to humans

The Ebola virus glycoprotein (GP) binds to cells during viral cell entry and is therefore thought to be a key determinant of viral fitness in different hosts. Previously, (Urbanowicz et al., 2016) analyzed a large set of naturally occurring amino acid mutations in the GP isolated from patients during the 2013–16 epidemic in Western Africa. The effect of these GP mutations on fitness were then experimentally determined using infectivity assays in cell culture. Several mutant genotypes dramatically increased viral infectivity relative to the Makona genotype isolated during the earliest stages of the epidemic. However, the effect of these mutations on viral transmission and fitness at the host population level have not yet been determined. We therefore applied the MFBD model to a large dataset of 1610 Ebola virus (EBOV) genomes sampled during the 2013–16 epidemic to infer the population-level fitness effects of these GP mutations.

We analyzed 9 out of the 18 amino acid mutations analyzed by Urbanowicz et al. (2016) that were present in at least 10 of the 1610 viral samples. These nine mutations fall in eight different genetic backgrounds or genotypes (Figure 7). Because Urbanowicz et al. (2016) found evidence for epistatic interactions between several of these mutations, we estimated the fitness of these eight genotypes rather than site-specific mutational fitness effects. Table 1 shows the relative fitness of these genotypes estimated at the population-level versus their fitness in cell culture.

Table 1
Estimated posterior median fitness and 95% CI for the Ebola GP mutants relative to the Makona genotype
https://doi.org/10.7554/eLife.45562.008
GenotypeSample freqBase modelModel + geo effectsEffect in cell culture
Makona0.0361.001.00Reference genotype
A82V0.7201.05 (1.04–1.07)1.26 (1.19–1.35)Increases infectivity 2X
P330S0.0020.98 (0.82–1.14)1.11 (0.96–1.24)Decreases infectivity
P330S+N107D+G480D0.0371.04 (0.98–1.12)1.27 (1.16–1.39)Increases infectivity > 2X
A82V+R410S0.0441.09 (1.00–1.18)1.31 (1.17–1.45)No or small effect
A82V+R410S+K439E0.0351.14 (1.01–1.26)1.36 (1.20–1.54)Increases infectivity 2-3X
A82V+R29K0.0191.06 (0.93–1.19)1.27 (1.10–1.45)Increases infectivity 2-3X
A82V+T230A0.0261.03 (0.93–1.11)1.23 (1.10–1.37)Increases infectivity 2-3X
A82V+I371V0.0671.03 (0.98–1.09)1.24 (1.14–1.35)Increases infectivity 2-3X
Relative fitness of Ebola virus genotypes circulating during the 2013–16 epidemic in Western Africa.

Ancestral fitness values were reconstructed by first finding the probability of a lineage being in each possible genotype based on the marginal site probabilities computed using (Equation 20). Ancestral fitness values were then computed by averaging the posterior median fitness of each genotype, weighted by the probability that the lineage was in each genotype. Fitness values are given relative to the Makona genotype isolated at the start of the epidemic. Clades are labeled according to their most probable genotype.

https://doi.org/10.7554/eLife.45562.009

Mapping the genotypes and fitness of lineages inferred under the MFBD model onto the phylogeny allows us to reconstruct the series of events by which EBOV adapted to humans (Figure 7). Shortly after the epidemic started in 2013, the A82V mutation occurred and gave rise to lineage B, which then spread to Sierra Leone, Liberia and Mali. (Urbanowicz et al., 2016) found that the A82V mutation increases infectivity by 2–3 fold in cell culture. At the population-level, this mutation appears to have a less dramatic effect, increasing transmissibility by only 5% relative to the Makona genotype. The P330S mutation appears to have temporarily decreased the fitness of the main surviving clade in lineage A, although mutations N107D and G480D later rescue the fitness of this lineage, consistent with the findings of Urbanowicz et al. (2016). Meanwhile, the R410S mutation occurred within lineage B but did not have an immediate effect on fitness. However, R410S appears to epistatically interact with mutation K439E, which occurs twice along the same lineage carrying the R410S mutation and in this genetic background increases infectivity 2–3 fold in cell culture. We estimate that the A82V+R410S+K439E genotype had the highest population-level fitness, but only increased fitness by 14% relative to the Makona genotype. Three other mutations, R29K, T230A and I371V, also occurred in the A82V genetic background, but were not estimated to have further increased the fitness of the A82V genotype.

Because the A82V mutation occurred along a lineage that spread from Guinea to Sierra Leone and several of the genotypes we considered were also geographically restricted, we performed a second analysis to check whether our estimates of genotype fitness were confounded by geographic differences in transmission rates. In this model, we accounted for geographic effects by including location (Guinea, Sierra Leone or Liberia) as an additional evolving character state or ‘site’ in the model. We found no evidence that transmission rates differed by location; relative transmission rates were 1.01 (95% CI: 0.97–1.05) in Sierra Leone and 0.99 (95% CI: 0.94–1.04) in Liberia compared with Guinea. All mutant genotypes had higher estimated fitness relative to the Makona genotype under the model with geographic effects due to a lower estimated fitness of the Makona genotype. However, the rank order of genotypic fitness values is consistent across models (Table 1). Overall, the population level fitness of all eight genotypes agree with their fitness in cell culture in terms of the sign or direction of their effects, but these genotypes had much greater fitness relative to the Makona genotype in cell culture than at the population level.

Influenza H3N2 fitness variation

We also applied the MFBD model to estimate the fitness effects of mutations in the hemagglutinin (HA) protein of human influenza virus subtype H3N2. Lee et al. (2018) recently estimated the relative preference for each amino acid residue at all sites in the HA protein in cell culture using a reverse genetics approach known as deep mutational scanning (DMS). The fitness effect of mutating one amino acid to another is expected to correlate strongly with the relative preference for each amino acid in these experiments. We therefore sought to compare the population-level fitness effects of naturally occurring mutations estimated under the MFBD model with their fitness measured in vitro by DMS.

To minimize the effect of antigenic mutations, which would not be reflected in the DMS experiments, we limited our analysis to viral lineages in the same antigenic cluster as the A/Perth/16/2009 strain studied by Lee et al. (2018). We first estimated the fitness effects of the 17 most abundant mutations that reached a frequency of 10% or greater among viruses sampled in the United States between 2009 and 2012. We found no apparent relationship between the estimated population-level fitness effects of these mutations and their in vitro effects, although there is agreement that most of these mutations are nearly neutral (Figure 8A). Although the 95% credible intervals on our estimates are narrow, these results need to be interpreted with extreme caution because our MCMC algorithm never converged on a stable posterior distribution for several mutations (Effective Sample Size < 10) due to strong correlations between mutations in their estimated fitness effects. This is likely due to the fact that many of these mutations occur only once in the phylogeny and share the same ancestry and therefore genetic background as other mutations in the phylogeny (Figure 8B).

Influenza H3N2 mutational fitness effects.

(A) The fitness effects of mutations estimated in vitro using deep mutational scanning versus their estimated population-level effects. In vitro fitness effects were quantified as the relative preference for the mutant versus the consensus amino acid residue in the deep mutational scanning experiments, given on a log2 scale. Population-level fitness effects were estimated using the MFBD model assuming multiplicative effects across sites. Error bars show the 95% credible intervals on the estimated population-level fitness effects. (B) Coancestry matrix showing the fraction of ancestry shared between each pair of mutations in the H3N2 phylogeny. The coancestry value represents the fraction of branches in the phylogeny that share both mutations based on a maximum parsimony reconstruction. The diagonal gives the fraction of all branches in the phylogeny with each individual mutation.

https://doi.org/10.7554/eLife.45562.010

While our population-level estimates did not correlate with the in vitro data, the fitness effects predicted by DMS correlate strongly with the maximum frequency that naturally occurring mutations reach in the human population (Lee et al., 2018). We therefore sought to test whether using the DMS experimental data to inform the MFBD model about the fitness effects of mutations, rather than estimating them independently from the phylogeny, would result in a better fit of the model to the H3N2 phylogeny and sequence data. Doing so requires a fitness model that aggregates mutational fitness effects across sites and then maps this combined fitness to the population-level fitness of a lineage. In our model, we sum the mutational fitness effects predicted by the relative amino acid preferences across all sites to get a composite predictor of fitness: θDMS=kLlog2πk,iπk,l. We then use (Equation 25) to map θDMS to overall population-level fitness.

Fitting our model to the H3N2 phylogeny allows us to calibrate how the mutational fitness effects based on relative preferences scale to population-level fitness. Overall, large changes in θDMS, resulting from mutations to more or less preferred amino acid residues, have a relatively small impact on population-level fitness. Population-level fitness grows slowly and roughly linearly with mutations to more preferred amino acids (Figure 9; inset). Nevertheless, when mutational fitness effects are aggregated across all sites, there are substantial fitness differences between lineages (Figure 9). Relative to a hypothetical lineage bearing the consensus sequence, fitness ranges from 0.84 to 1.04 across lineages with many lineages having a relative fitness less than one, indicating a slightly deleterious mutation load. Accounting for these fitness differences results in the MFBD model informed by the DMS data fitting the H3N2 phylogeny substantially better (Log likelihood: −4184) than a model assuming all mutations are neutral (Log likelihood: −7510). As would be expected, lineages predicted to be more fit also tend to persist between influenza seasons. Most notably, a lineage with higher than average fitness circulates in 2009 and 2010 during the H1N1 pandemic. This lineages carries the T228A mutation, which is predicted to have a large beneficial effect in the DMS experiments. It is therefore tempting to speculate that this mutation may have conferred an advantage that helped seasonal H3N2 compete with the pandemic H1N1 virus.

Relative fitness of influenza H3N2 lineages circulating in the United States between 2009 and 2012.

Fitness values were reconstructed based on a fitness model that maps mutational fitness effects predicted based on deep mutational scanning experiments to population level fitness. The inset shows this fitness mapping for the model parameters with the highest posterior probability: α=0.0098 and κ=0.964. Uncertainty in ancestral amino acid sequences was taken into account by first computing the marginal site probability at each site. Ancestral fitness values were then reconstructed by marginalizing over all possible ancestral sequences using the marginal site probabilities.

https://doi.org/10.7554/eLife.45562.011

Discussion

Many assumptions are made in phylogenetics to model molecular evolution in a statistically tractable way. Historically, one of the most pervasive yet biologically questionable of these assumptions has been that sequences evolve neutrally along lineages, such the mutations do not feedback and alter the branching process shaping the phylogeny. Our marginal fitness birth-death (MFBD) model allows us to relax this core assumption in order to consider how non-neutral evolution at multiple sites affects sequence evolution, the fitness of lineages, and the overall branching structure of a phylogeny. While our approach is not exact in that it approximates genotype probabilities by assuming sites evolve independently when computing the marginal fitness of a lineage, we have shown that this approximation generally works well and only produces significant errors in rather extreme situations, such as the four genotype model with very strong selection or epistasis. While an earlier approach based on birth-death models allowed for lineage-specific fitness values to be inferred from the branching pattern of a phylogeny (Neher et al., 2014), this approach did not connect fitness back to the mutational process nor allow for the fitness effects of individual mutations or genotypes to be estimated. Using our approach, we demonstrated that the fitness effects of specific mutations can be estimated from simulated phylogenies under the MFBD with accuracy comparable to an exact multi-type birth death model. The MFBD model therefore provides a new, statistically powerful way of incorporating adaptive molecular evolution into phylodynamics.

The MFBD model allows us to exploit phylogenetic information about adaptive evolution that most methods for inferring selection from patterns in sequence data ignore. Currently, codon-substation models (Goldman and Yang, 1994; Muse and Gaut, 1994) and the related class of mutation-selection models (Yang and Nielsen, 2008) are by far the most widely used approach for inferring selection. These approaches rely on comparing sequence substitution patterns such as the dN/dS ratio of non-synonymous to synonymous substitutions across sites. These approaches can be very powerful when sequences from highly divergent taxa are compared, such that enough time has elapsed for multiple substitutions at a single site to have accumulated between lineages. But on the shorter timescales relevant to evolution within a population, substitution patterns like the dN/dS ratio are relatively insensitive to selection pressures and may produce misleading inferences of selection (Kryazhimskiy and Plotkin, 2008). For example, a highly beneficial non-synonymous mutation that occurs in a single lineage and then spreads through a population may produce a very low dN/dS ratio, indicative of purifying selection rather than adaptive evolution. In contrast, comparing the evolutionary dynamics of lineages with and without the mutation allows us to infer if that mutation confers a competitive advantage. Thus, considering the branching pattern of phylogenies provides additional information about molecular evolution not visible from substitution patterns in sequence data alone.

While new technologies increasingly allow researchers to quantify mutational fitness effects in vitro or even in vivo (Zanini and Neher, 2013; Thyagarajan and Bloom, 2014), how fitness measured in the lab translates to fitness in nature is largely unknown. This is especially pertinent for emerging pathogens whose epidemic potential often depends on new adaptive mutations (Antia et al., 2003; Longdon et al., 2014). Phylodynamic approaches like the MFBD model that can quantify fitness at the host population level are therefore greatly needed, as they offer a means to assess the epidemiological significance of mutant lineages. Extrapolating from our experience with Ebola, where the population-level fitness effects of each mutant genotype we considered matched the sign of their effect in cell culture, we suspect that fitness measured in the lab will generally agree with fitness in nature. This seems reasonable, as mutations that increase replication or cellular infectivity within hosts should generally promote transmissibility between hosts (e.g. Quinn et al., 2000; Fraser et al., 2007). But at the same time, there is no reason to believe that transmission rates will increase linearly or even monotonically with increasing within-host growth rates. We therefore expect that the magnitude of fitness effects might often greatly differ across scales, as we found for the A82V glycoprotein mutation in Ebola. While A82V doubles infectivity in cell culture, we estimated that it only increases transmissibility at the population level by 5% (95% CI: 4–7%). Interestingly, (Diehl et al., 2016) found that A82V only slightly increases viral titers in Ebola patients, which is likely a much better proxy for transmissibility than cellular infectivity, lending support to our more moderate estimates at the population level.

For influenza, we were unable to reliably estimate the fitness effects of individual mutations from the H3N2 phylogeny. We believe that these inference problems likely stem from the fact that many of these mutations occur only once in the phylogeny and in the same genetic background as other mutations in the HA protein. The shared phylogenetic ancestry of mutations creates an identifiability problem akin to the problem of collinearity in more standard regression-type models. In either case, the individual effects of highly correlated variables are difficult or impossible to infer. Nevertheless, including the mutational fitness effects predicted by deep mutational scanning experiments improved the fit of the MFBD model to the H3N2 phylogeny by thousands of log likelihood units. Accounting for these fitness effects in the MFBD model also revealed substantial variation in population-level fitness among viral lineages within a single antigenic cluster. Most lineages were reconstructed to have a slightly deleterious mutation load, consistent with earlier reports that background variation in fitness arising from deleterious mutations, not just antigenic mutations, plays a large role in determining which H3N2 lineages ultimately persist (Illingworth and Mustonen, 2012; Luksza and Lässig, 2014; Koelle and Rasmussen, 2015). Moreover, the fitness variation uncovered by our analysis likely represents only the ‘tip of the iceberg’, since there are likely mutations in other genomic segments besides HA with large fitness effects (Raghwani et al., 2017), which we did not consider.

The influenza analysis highlights some of the inevitable difficulties encountered when inferring mutational fitness effects from phylogenies. Increasing the number of sites under consideration also increases the complexity of the genetic background in which mutations occur due to the increased probability of mutations being linked to other mutations rather than occurring in isolation. This leads to strong correlations between the fitness effects of different sites in an increasingly high dimensional parameter space, making statistical inference challenging, especially using MCMC methods. Spurious correlations may also arise due to additional, unmodeled sources of fitness variation. For example, if a mutation occurs coincidently with another beneficial mutation or the mutation occurs by chance along a lineage spreading through a higher fitness environment, it will likely be inferred to increase fitness even if it is actually neutral. In the future, the MFBD should therefore be extended to account for unmodeled sources of fitness variation. For example, each lineage could be assigned a random fitness effect representing the unmodeled components of fitness variation. These random effects could then be modeled as a continuous trait evolving along lineages, such that more closely related lineages would be expected to have similar fitness and overly large changes between closely related lineages would be penalized. Such a model would then allow us to say whether a fitness effect attributed to a particular mutation could be equally well explained by random effects arising from unmodeled fitness variation. Until such a principled approach is implemented, the fitness effects of individual mutations need to be interpreted carefully unless they occur in multiple genetic backgrounds and confounding sources of fitness variation can be accounted for, as we tried to do for Ebola by including potentially confounding geographic fitness effects.

In spite of these shortcomings, we believe the MFBD model offers a powerful means to explore many questions not previously possible with strictly neutral phylodynamic models. Even if the fitness effects of individual mutations are not identifiable, it may still be possible to infer the distribution of fitness effects across sites, a key determinant of adaptive evolution that has only been explored in a few systems (Eyre-Walker and Keightley, 2007). The MFBD model can also be used to compare the fitness of a mutation or lineage across different environments, such as in different hosts of a pathogen. Finally, the MFBD is not limited to exploring sequence evolution, as the model is generalizable to any discrete character state, including phenotypic, geographic or environmental characters. Thus, more generally, our model can be thought of as a multi-trait, multi-type birth-death model that can be used to explore how different molecular and non-molecular characters interact to shape the overall fitness of lineages in a phylogeny.

References

  1. 1
  2. 2
  3. 3
  4. 4
    Population Genetics of Molecular Evolution
    1. CD Bustamante
    (2005)
    Springer.
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
    Sampling theory for neutral alleles in a varying environment
    1. RC Griffiths
    2. S Tavaré
    (1994)
    Philosophical Transactions of the Royal Soceity B Biological Sciences 344:403–410.
    https://doi.org/10.1098/rstb.1994.0079
  15. 15
  16. 16
  17. 17
  18. 18
    The coalescent process in models with selection
    1. NL Kaplan
    2. T Darden
    3. RR Hudson
    (1988)
    Genetics 120:819–829.
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
    An integrated framework for the inference of viral population history from reconstructed genealogies
    1. OG Pybus
    2. A Rambaut
    3. PH Harvey
    (2000)
    Genetics 155:1429–1437.
  33. 33
  34. 34
  35. 35
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
    Insights into the early epidemic spread of ebola in sierra Leone provided by viral sequence data
    1. T Stadler
    2. D Kühnert
    3. DA Rasmussen
    4. L du Plessis
    (2014)
    PLOS Currents, 6, 10.1371/currents.outbreaks.02bc6d927ecee7bbd33532ec8ba6a25f, 25642370.
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
    Ebola situation reports
    1. WHO
    (2016)
    Ebola situation reports, http://apps.who.int/ebola/ebola-situation-reports.
  51. 51
  52. 52

Decision letter

  1. Aleksandra M Walczak
    Reviewing Editor; École Normale Supérieure, France
  2. Diethard Tautz
    Senior Editor; Max-Planck Institute for Evolutionary Biology, Germany
  3. Trevor Bedford
    Reviewer; Fred Hutchinson Cancer Research Center, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Coupling adaptive molecular evolution to phylodynamics using fitness-dependent birth-death models" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Diethard Tautz as the Senior Editor. The following individuals involved in review of your submission have agreed to reveal their identity: Trevor Bedford (Reviewer #1).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

The reviewers and editor agree that the subject of introducing selection and fitness into phylogenic inference is extremely important, neglected and very hard. For these reasons everyone is very enthusiastic about this effort. Nevertheless, all reviewers raise concerns both about presentation and about validation and applications. The reviewers should be understanding if not all their suggestions can be incorporated but we would urge you to try and satisfy them as much as possible, and very clearly explain instances where you cannot follow their suggestions. In our opinion, given the scope of eLife and that many readers will probably be less expert in probability theory than the reviewers, we think it is extremely important to be very clear about the assumptions and where the method works and where it falls short (see comments by reviewer 2). In light of this, reviewer 1 comments that testing different systems will provide such intuition for the less mathematically trained. You do not have to worry about selling the method – I think this is a point where we also learn from negative results.

We are including the reviewers' comments in full, since they are all instructive and each is very different.

Reviewer #1:

I've been working in the field of phylodynamics for the last several years, and I can say that it's widely acknowledged how difficult it's been to bridge the more traditional craft of demographic inferences of population size and migration to more fundamental evolutionary insights of strain fitnesses. I applaud the efforts of Rasmussen and Stadler to tackle such an important issue in their manuscript "Coupling adaptive molecular evolution to phylodynamics using fitness-dependent birth-death models". I believe this work represents an innovative modeling avenue to address fitness in phylodynamic inference and will eventually lead to many fruitful biological insights.

I found the model explication clear and well motivated. Although I admit difficulty with the full mathematical details, I could certainly follow the overall approach and was given a good sense of what approximations had been taken and why.

I believe the core modeling advance is completely sound and warrants publication. However, I believe that for eLife in particular, a manuscript should point the way towards biological insight. The application to Ebola in West Africa is not as biologically exciting as other systems; the method appears to work, but there's just little adaptive evolution going on. I believe the manuscript would find more traction with readers if it showed a clear example of elucidating impactful fitness differences in an evolving population.

Major suggestions:

1) My central suggestion would be to include a second biological example in which strain variation should be more apparent. Given David Rasmussen's expertise and the known biology of the system, I would suggest H3N2 influenza. Working with influenza phylogenies, there are often clear cut examples of new mutations emerging that rapidly increase in the population (such as HA1 F159Y, https://nextstrain.org/flu/seasonal/h3n2/ha/12y?c=gt-HA1_159). If a few of these genotypes are selected (such as those dictated by the Koel 7, see Koel et al., 2013), are there fitness differences visible in the phylogeny?

I'm not suggesting to do any sort of formal prediction from these fitnesses. Just to see if H3N2 phylogenies give reasonable model outputs.

Alternatively, if flu doesn't work as a motivating example, this should be discussed, as it's so obvious. I don't know if the low sampling fraction would pose an issue to the model.

There were a couple issues I had with Ebola analysis. Nothing fundamental, but issues that should be addressed to get at the biology. These were:

2) I remain concerned about the conflation of epidemiological circumstance and evolutionary fitness in these inferences (Bedford and Malik, 2016, Cell). In particular, the GP A82V mutation corresponds cleanly to a transition from early circulation in Guinea to circulation in Sierra Leone. Could this be addressed technically by adding an additional "site" to the model distinguishing between Guinea, Sierra Leone and Liberia?

3) It seems funny to me to only look the subset of sites investigated by Urbanowicz et al. in the lab. There were other common mutations in GP that weren't looked at by Urbanowicz et al. For example, both R321K and E580G (https://nextstrain.org/ebola?c=gt-GP_321,580) are significantly more common than R29K, I371V and K439E (https://nextstrain.org/ebola?c=gt-GP_29,371,439). Thus, it's weird to me to present a table that lists fitness impacts of several mutations, when there are other mutations that are largely colinear with the presented mutations, that may have actually been the causal variants.

I have this same issue with genes outside of GP. There are highly successful mutations like NP R111C (https://nextstrain.org/ebola?c=gt-NP_111) that are not accounted for. These may be driving the fitness differences observed. For example, L177A visually correlates better with the successful clade than GP E410V, which received a lot of model weight.

Is it possible to put forth a more systematic approach here? I don't think with what's been done that it's possible to conclude much about genetic variants driving transmission differences during the West African epidemic. I worry that handing out an approach as described will lead to people cherry picking what sites they use to define strains in subsequent analyses. There should be some guidance for how to approach this objectively.

Reviewer #2:

The manuscript "Coupling adaptive molecular evolution to phylodynamics using fitness-dependent birth-death model" introduces a probabilistic framework to estimate the joint distribution of the sequence data and of the phylogenetic tree, in which the tree is affected by the mutational process. The authors first introduce a general set of coupled different equations for the backwards evolution of the probabilities of observed subtrees and non-observed lineages. As these equations are intractable, a series of approximations to reduce the size of the problem are considered, including projections onto low-dimensional subspaces, decorrelation between sites in the genome… These simplified equations are solved in three cases: two toy models (small number of sites), for which the ground truth is known, and Ebola glycoprotein (GP) data sampled during the 2013-16 epidemic. Based on their computational approach, the authors can quantitatively estimate the fitness effects of 8 GP mutants.

I fully agree with the authors that estimating the likelihood of sequence data and tree altogether, without any unrealistic separation between the tree dynamics and the mutation processes, is an important and very challenging goal. Despite the biological importance of this issue, few works have considered the coupling between the tree dynamics and fitness so far, see for instance Neher, Russel and Shraiman (2014). The present work by Rasmussen and Stadler is an interesting and sophisticated step in this direction. I have however several criticisms against the current version of the manuscript, both on form and content.

My first comment is about the accessibility of the paper, more precisely, whether it is self-contained or not. While some compromise has always to be made between a full, self-contained presentation and the necessity to have a reasonable length, I think the manuscript should be made more accessible. I had to go to Stadler and Bonhoeffer (2013) to understand clearly the meaning of the main quantities (Dn,i and Ei); inserting a figure similar to Figure 1 of that paper would be useful. Equations 2 and 3, which are identical to the ones of Stadler and Bonhoeffer (apart from the presence of mutations – γi,j – instead of lineage – λi,j – rates), are also better explained in that paper. Furthermore, the presence of the normalization factor including the probability of being observed in Equation 5 is not really explained here. Again, the 2013 paper was helpful to understand the origin of this denominator.

Secondly, I agree that the set of coupled equations over the D and E variables is intractable, as its solution would require to track an exponentially large (in the genome length) number of quantities. There is no doubt that simplifying approximations are needed, but I am worried that the range of validity of these approximations is not adequately studied here. My concerns are:

1) The dimensionality of the problem is reduced from exponential to linear (in the number of sites) for the D probabilities, while the E probabilities are restricted to a discrete set of points in the fitness space. The corresponding equations, respectively, 19 and 15, are obtained by ad hoc modifications from the original evolution equations, i.e. they are not derived by clear methods that would help the reader understand the magnitude of the terms neglected. It is far from being clear if and when these approximations are acceptable. Again, I understand that approximations are needed, I simply would like to see some justification or some conditions under which the approximations would not affect too much the outcome. The necessity of dimensional reduction is not specific to computational evolutionary biology. Chemical master equations are also virtually of infinite dimensions, and cannot be solved without simplifications, in particular finite-state projections, reminiscent of the procedures followed in the present manuscript. The accuracy of these simplifications is a subject of concern and is studied, see for instance Munsky and Kammash (2006, J. Chem. Phys.).

2) Another drastic hypothesis, besides dimensional reduction, is the decorrelation of sites in the fitness function, see Equation 12. Obviously, epistatic effects are present in reality, and it is therefore important to understand how they would bias the estimates of the fitness effects predicted by the algorithm. This point is studied with the simulated 2-site model (bottom panels in Figure 1), but I was unable to understand what the conclusion (the obvious fact that errors grow with the strength of selection against the double mutant) entails for more complex cases. What can we expect as the genome size grows, and epistatic effects accumulate? Even in the absence of coupling in the fitness, the quality of the prediction of site-fitness effects seems to deteriorate rapidly with the genome size, from 2 to 10 sites, see top panels in Figure 3. The latter size is comparable to the number (9) of mutations studied for GP, but how much could the results of Table 1 affected by the other, not studied mutations in this Ebola virus protein?

3) Across a large phylogenetic tree, it is expected that the relationship between the fitness and the sequences varies across lineages, e.g. owing to changes in the environmental conditions. I wonder how much such effects could influence the predictions for the fitness effects output by the algorithm. To be more specific, suppose that the fitness f changes by some epsilon (depending on the lineage n), how much would the D and E probabilities change? I expect that this could be studied numerically in the small size models, or even analytically with linear-response theory. Assessing the robustness of the predictions against slight variations in the sequence-to-fitness relation is an important issue.

As a conclusion, this work is interesting and can be appreciated as a valuable effort to solve an important problem in phylogeny/sequences estimation. A substantial number of further investigations is, however, required to assess the validity and robustness of the method, before one can fully trust its predictions.

Reviewer #3:

This is an interesting paper that addresses the problem of jointly modeling mutation, selection, and sampling from a population, where the individuals are related through a phylogenetic tree. The idea is that selection can alter the expected shape of a tree, and therefore the tree shape is informative about both the genotypic state, and the selection coefficient.

The model builds upon an earlier work (Stadler and Bonhoeffer, 2013) in which a model for the evolution of a single site under selection was introduced. The current paper extends this model to multiple sites under selection. The authors achieve this by making additional assumptions, boiling down to assuming no linkage between these sites.

I am unsure about one aspect of the model – the initialisation of the variable D(t), which is initialized as the product of the probability a lineage is sampled upon death, s, and the rate that the lineage dies, d. I don't see why the death rate comes into this – if one state has a particularly low death rate, it seems that we should be likely to see it, rather than less likely – after all the pathogen does not need to die for the sample to be taken.

The authors show that they are able to infer the selection coefficient / fitness cost in a 4-genotype scenario, where the fitness cost is 0.5. In the application to actual data, the corresponding parameter is inferred to be in the range 5-15%, rather than 50%; and many of the inferences include 0 in their 95% confidence interval. As it stands it is hard to have strong confidence in these results, as there are no simulations to show that the model is sufficiently powered and provides reasonably unbiased estimates in this regime. It would be very helpful if the authors could provide a simulation with parameters in the regime appropriate for the real data.

https://doi.org/10.7554/eLife.45562.013

Author response

Reviewer #1:

I've been working in the field of phylodynamics for the last several years, and I can say that it's widely acknowledged how difficult it's been to bridge the more traditional craft of demographic inferences of population size and migration to more fundamental evolutionary insights of strain fitnesses. I applaud the efforts of Rasmussen and Stadler to tackle such an important issue in their manuscript "Coupling adaptive molecular evolution to phylodynamics using fitness-dependent birth-death models". I believe this work represents an innovative modeling avenue to address fitness in phylodynamic inference and will eventually lead to many fruitful biological insights.

I found the model explication clear and well motivated. Although I admit difficulty with the full mathematical details, I could certainly follow the overall approach and was given a good sense of what approximations had been taken and why.

I believe the core modeling advance is completely sound and warrants publication. However, I believe that for eLife in particular, a manuscript should point the way towards biological insight. The application to Ebola in West Africa is not as biologically exciting as other systems; the method appears to work, but there's just little adaptive evolution going on. I believe the manuscript would find more traction with readers if it showed a clear example of elucidating impactful fitness differences in an evolving population.

Thanks for your comments, Trevor. We acknowledge that Ebola virus has had limited time to adapt to the human population, and other rapidly evolving viruses like influenza exhibit more adaptation.

Major suggestions:

1) My central suggestion would be to include a second biological example in which strain variation should be more apparent. Given David Rasmussen's expertise and the known biology of the system, I would suggest H3N2 influenza. Working with influenza phylogenies, there are often clear cut examples of new mutations emerging that rapidly increase in the population (such as HA1 F159Y, https://nextstrain.org/flu/seasonal/h3n2/ha/12y?c=gt-HA1_159). If a few of these genotypes are selected (such as those dictated by the Koel 7, see Koel et al., 2013), are there fitness differences visible in the phylogeny?

I'm not suggesting to do any sort of formal prediction from these fitnesses. Just to see if H3N2 phylogenies give reasonable model outputs.

Alternatively, if flu doesn't work as a motivating example, this should be discussed, as it's so obvious. I don't know if the low sampling fraction would pose an issue to the model.

We have now also applied our method to influenza H3N2 as a second motivating example. In particular, we applied our method to look at fitness variation among strains in the Perth 2009 antigenic cluster. This allows for direct comparisons of population-level fitness effects estimated using our method with those estimated by Lee et al. (2018) in vitro using deep mutational scanning.

However, we found it very difficult to estimate site-specific mutational fitness effects for H3N2, even when we restricted our analysis to a small subset of naturally occurring mutations. We think that this is likely due to the fact that many of these mutations occur only once in the phylogeny and in the same genetic background as other mutations, and this shared common ancestry between mutations creates a problem with identifiability similar to the problem of co-linearity among highly correlated variables in regression-type models.

While this is disappointing, we also performed a second analysis where we let the deep mutational scanning data from Lee et al. inform our MFBD about the fitness effects of the naturally occurring mutations. This model fit the H3N2 phylogeny dramatically better than a model where mutations were assumed to be neutral. Moreover, reconstructing the ancestral fitness of lineages in the phylogeny under this model revealed substantial fitness variation among lineages within the Perth 2009 cluster, with most lineages carrying a mildly deleterious mutation load.

Given that previous work on H3N2 has argued for the importance of fitness variation within antigenic clusters in determining which lineages survive (Luska and Lässig, 2014; Koelle and Rasmussen, 2015), we believe this application to H3N2 provides another example of how our model can be applied to gain biological insights into the mechanisms of pathogen adaptation.

Looking at the F159Y mutation would have also been be interesting. But we only considered lineages circulating between 2009 and 2012 in our analysis and all lineages circulating before 2014 have the F allele.

There were a couple issues I had with Ebola analysis. Nothing fundamental, but issues that should be addressed to get at the biology. These were:

2) I remain concerned about the conflation of epidemiological circumstance and evolutionary fitness in these inferences (Bedford and Malik, 2016, Cell). In particular, the GP A82V mutation corresponds cleanly to a transition from early circulation in Guinea to circulation in Sierra Leone. Could this be addressed technically by adding an additional "site" to the model distinguishing between Guinea, Sierra Leone and Liberia?

This is an excellent point. We have now rerun the Ebola analysis with the geographic location as an additional “site” as suggested. However, we found no differences in transmission rates between locations. The A82V and other mutant genotypes are actually estimated to have slightly larger fitness effects when we account for geographic effects due to a lower estimated fitness for the Makona genotype. The rank order of the genotype fitness values is largely conserved. These results are now included in the main text.

3) It seems funny to me to only look the subset of sites investigated by Urbanowicz et al. in the lab. There were other common mutations in GP that weren't looked at by Urbanowicz et al. For example, both R321K and E580G (https://nextstrain.org/ebola?c=gt-GP_321,580) are significantly more common than R29K, I371V and K439E (https://nextstrain.org/ebola?c=gt-GP_29,371,439). Thus, it's weird to me to present a table that lists fitness impacts of several mutations, when there are other mutations that are largely colinear with the presented mutations, that may have actually been the causal variants.

I have this same issue with genes outside of GP. There are highly successful mutations like NP R111C (https://nextstrain.org/ebola?c=gt-NP_111) that are not accounted for. These may be driving the fitness differences observed. For example, L177A visually correlates better with the successful clade than GP E410V, which received a lot of model weight.

While we acknowledge that any of these additional mutations might be driving fitness differences between lineage, we don’t believe including these mutations in our analysis is as interesting as including geographic effects. There will likely always be additional unmodeled sites affecting fitness, but our method is not intended to provide a complete “genome scan” for sites under selection. Rather it is a way to formally test hypotheses about the fitness effects of particular mutations. Thus while our analysis is limited to the mutations studied by Urbanowicz et al., this is the intended use of our method and similar to what we imagine others will use the method for.

Is it possible to put forth a more systematic approach here? I don't think with what's been done that it's possible to conclude much about genetic variants driving transmission differences during the West African epidemic. I worry that handing out an approach as described will lead to people cherry picking what sites they use to define strains in subsequent analyses. There should be some guidance for how to approach this objectively.

We discuss issues with differentiating between the fitness effects of individual mutations versus unmodeled fitness variation in the Discussion. We also warn against over interpreting estimates of mutational fitness effects unless “they occur in multiple genetic backgrounds and confounding sources of fitness variation can be accounted for”. Furthermore, we think that including the additional Ebola analysis with geographic effects provides a good example of how conclusions about the fitness effects of particular mutations can and should be tested by including other potentially confounding sources of fitness variation in the analysis.

However, we agree with the reviewer that there is likely a more general way of accounting for unmodeled fitness effects. We now suggest one potential approach in the Discussion:

“In the future, the MFBD could be extended to account for unmodeled sources of fitness variation. For example, each lineage could be assigned a random fitness effect representing the unmodeled components of fitness variation. These random effects could then be modeled as a continuous trait evolving along lineages, such that more closely related lineages would be expected to have similar fitness due to shared genetic ancestry and overly large changes in fitness between closely related lineages would be penalized. Such a model would then allow us to say whether or not a mutational fitness effect could be equally well explained by random effects arising from unmodeled effects.”

Reviewer #2:

[…]

My first comment is about the accessibility of the paper, more precisely, whether it is self-contained or not. While some compromise has always to be made between a full, self-contained presentation and the necessity to have a reasonable length, I think the manuscript should be made more accessible. I had to go to Stadler and Bonhoeffer (2013) to understand clearly the meaning of the main quantities (Dn,i and Ei); inserting a figure similar to Figure 1 of that paper would be useful. Equations 2 and 3, which are identical to the ones of Stadler and Bonhoeffer (apart from the presence of mutations – γi,j – instead of lineage – λi,j – rates), are also better explained in that paper. Furthermore, the presence of the normalization factor including the probability of being observed in Equation 5 is not really explained here. Again, the 2013 paper was helpful to understand the origin of this denominator.

We have tried our best to make the model description as clear, precise and accessible as possible. But we envision our paper having two types of readers: those more interested in the biological applications and those more interested in the mathematical details. For the first type of reader, we have included a high-level summary of the model in Figure 1. For those interested in the mathematical details, we have chosen to devote more space to describing the new marginal fitness-birth-death model in detail than the original multi-type model. We agree with the reviewer that a reader interested in the full details of the original multi-type model may need to go back and consult Stadler and Bonhoeffer (2013).

We have made adjustments to improve accessibility as the reviewer suggested. Most notably, we have followed the reviewers suggestion and added a panel to Figure 1 illustrating the original multi-type birth-death model (including the Dn,I and Ei probabilities) and how this differs from the new marginal fitness birth-death model. We also now describe the normalization factor used in Equation 5 to condition the birth-death process on leaving at least on sampled descendent.

Secondly, I agree that the set of coupled equations over the D and E variables is intractable, as its solution would require to track an exponentially large (in the genome length) number of quantities. There is no doubt that simplifying approximations are needed, but I am worried that the range of validity of these approximations is not adequately studied here. My concerns are:

1) The dimensionality of the problem is reduced from exponential to linear (in the number of sites) for the D probabilities, while the E probabilities are restricted to a discrete set of points in the fitness space. The corresponding equations, respectively, 19 and 15, are obtained by ad hoc modifications from the original evolution equations, i.e. they are not derived by clear methods that would help the reader understand the magnitude of the terms neglected. It is far from being clear if and when these wild approximations are acceptable. Again, I understand that approximations are needed, I simply would like to see some justification or some conditions under which the approximations would not affect too much the outcome. The necessity of dimensional reduction is not specific to computational evolutionary biology. Chemical master equations are also virtually of infinite dimensions, and cannot be solved without simplifications, in particular finite-state projections, reminiscent of the procedures followed in the present manuscript. The accuracy of these simplifications is a subject of concern and is studied, see for instance Munsky and Kammash (2006, J. Chem. Phys.).

We agree with the reviewer that we gave too little attention to the validity of our approximations under different conditions in the original manuscript. Thus, instead of focusing our analysis of the four-genotype model on individual simulated phylogenies, we now more broadly explore how well the MFBD performs against the exact multi-type birth-death model under different evolutionary dynamics including over a wide range of mutation rates, selection coefficients and epistatic fitness effects. In particular, Figures 3 and 4 now compare the D and E probabilities along a lineage approximated under the MFBD model with the exact model. This new analysis shows that the MFBD approximates the D and E probabilities well under most conditions and that only under rather extreme conditions (very strong selection or epistatic interactions) do approximations introduce appreciable error. But even in these extreme cases the overall magnitude of error introduced is rather small.

In the revision, we also now give more attention to the approximation we used to track the E probabilities in discretized fitness space. While the reviewer sees our approximation as “wild”, we show that our approximation tracks the E probabilities well for a given lineage (see new Figure 4). In fact, using an even more dramatic approximation used in previous multi-type birth-death methods (Rabosky et al., 2014; Barido-Sottani et al., 2018), where transitions between states/fitness classes are ignored altogether along unobserved lineages, still provides accurate results. Thus, while neither approximation tracks the detailed dynamics of how unobserved lineages move through genotype space and thus fitness space, this does not seem necessary in order to capture the overall dynamics of how the E probabilities evolve backwards in time, which is all that is necessary for the MFBD model.

2) Another drastic hypothesis, besides dimensional reduction, is the decorrelation of sites in the fitness function, see Equation 12. Obviously, epistatic effects are present in reality, and it is therefore important to understand how they would bias the estimates of the fitness effects predicted by the algorithm. This point is studied with the simulated 2-site model (bottom panels in Figure 1), but I was unable to understand what the conclusion (the obvious fact that errors grow with the strength of selection against the double mutant) entails for more complex cases. What can we expect as the genome size grows, and epistatic effects accumulate? Even in the absence of coupling in the fitness, the quality of the prediction of site-fitness effects seems to deteriorate rapidly with the genome size, from 2 to 10 sites, see top panels in Figure 3. The latter size is comparable to the number (9) of mutations studied for GP, but how much could the results of Table 1 affected by the other, not studied mutations in this Ebola virus protein?

We expect that the between-site correlations we found for the two-site, four-genotype model would persist between any pair of sites, regardless of the total genome length, because we are assuming complete linkage between sites (no recombination). Thus, we always expect the correlations to become stronger as the strength of selection or epistatic fitness effects grows, regardless of genome length.

However, we do not believe these correlations cause the accuracy of our estimated site-specific fitness effects to decline with more sites. A far more likely reason for this decline is that the complexity of the genetic background in which mutations occur increases with more sites. Mutations therefore start to occur in the same genetic background as other mutations, making it difficult to estimate the effects of individual mutations. This leads to more uncertainty and variability in our estimates of site-specific effects, which explains the loss of accuracy with increasing numbers of sites.

3) Across a large phylogenetic tree, it is expected that the relationship between the fitness and the sequences varies across lineages, e.g. owing to changes in the environmental conditions. I wonder how much such effects could influence the predictions for the fitness effects output by the algorithm. To be more specific, suppose that the fitness f changes by some epsilon (depending on the lineage n), how much would the D and E probabilities change? I expect that this could be studied numerically in the small size models, or even analytically with linear-response theory. Assessing the robustness of the predictions against slight variations in the sequence-to-fitness relation is an important issue.

We also expect that the sequence-to-fitness relationship would vary across the phylogeny due to changes in environmental conditions or other unmodeled fitness effects, such as mutations at other sites. One way of assessing the impact of this background fitness variation on our fitness estimates is to directly incorporate potentially confounding variables into the model. Following the suggestion of reviewer #1, we now take this approach in our Ebola analysis by including the geographic location of lineages in our fitness model.

In the Discussion, we also now suggest a more principled approach to dealing with background variation in fitness due to unmodeled effects: random fitness effects could be assigned to individual lineages, allowing us to say whether a fitness effect ascribed to a particular mutation could equally well be explained by random effects representing unmodeled sources of fitness variation.

We could, as the reviewer suggests, also perform an analysis where we add background “noise” into the mapping between sequences and fitness, but it seems reasonable to expect that the quality of our estimates would depend entirely on the magnitude of this noise.

As a conclusion, this work is interesting and can be appreciated as a valuable effort to solve an important problem in phylogeny/sequences estimation. A substantial number of further investigations is, however, required to assess the validity and robustness of the method, before one can fully trust its predictions.

Reviewer #3:

This is an interesting paper that addresses the problem of jointly modeling mutation, selection, and sampling from a population, where the individuals are related through a phylogenetic tree. The idea is that selection can alter the expected shape of a tree, and therefore the tree shape is informative about both the genotypic state, and the selection coefficient.

The model builds upon an earlier work (Stadler and Bonhoeffer, 2013) in which a model for the evolution of a single site under selection was introduced. The current paper extends this model to multiple sites under selection. The authors achieve this by making additional assumptions, boiling down to assuming no linkage between these sites.

Yes, except to clarify we are actually assuming complete linkage between sites because we do not consider recombination. However, along an individual lineage, we are assuming correlations between the state of different sites can be ignored when calculating genotype probabilities. This is actually quite different from assuming no linkage in a standard population genetics model, where recombination would break up correlations between alleles at different sites at the population level. Our assumption is much more restricted: we are only assuming these correlations can be ignored at the level of an individual lineage.

I am unsure about one aspect of the model – the initialisation of the variable D(t), which is initialized as the product of the probability a lineage is sampled upon death, s, and the rate that the lineage dies, d. I don't see why the death rate comes into this – if one state has a particularly low death rate, it seems that we should be likely to see it, rather than less likely – after all the pathogen does not need to die for the sample to be taken.

In our version of the multi-type birth-death model, sampling is assumed to be coupled to death/removal events. A sampled individual must therefore have died to be sampled, and the D(t) probabilities need to reflect this. However, a lineage can die and not be sampled, and in this case the reviewer’s reasoning is correct: a lineage with a lower death rate will have a higher probability of surviving and therefore of being observed at a future sampling event.

The MTBD can be reformulated such that individuals can be sampled independently of death rates, but we would then need to take into account sampled ancestors where a sampled lineage can survive and produce other sampled lineages after the time of sampling. This can be done (see Kühnert et al., 2016), but we feel that this is beyond the scope of an already rather long paper.

The authors show that they are able to infer the selection coefficient / fitness cost in a 4-genotype scenario, where the fitness cost is 0.5. In the application to actual data, the corresponding parameter is inferred to be in the range 5-15%, rather than 50%; and many of the inferences include 0 in their 95% confidence interval. As it stands it is hard to have strong confidence in these results, as there are no simulations to show that the model is sufficiently powered and provides reasonably unbiased estimates in this regime. It would be very helpful if the authors could provide a simulation with parameters in the regime appropriate for the real data.

We do consider estimating site-specific fitness effects in the 5-15% range in Figure 6, where the mutational fitness effects in these simulations were chosen to randomly vary between ~0.4 and ~1.2. While these simulations were not exactly calibrated to the epidemiology of Ebola, they do show that our model can consistently estimate site-specific fitness effects over a wide range of birth rates, mutation rates, sampling fractions and even mutational fitness effects.

https://doi.org/10.7554/eLife.45562.014

Article and author information

Author details

  1. David A Rasmussen

    1. Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, United States
    2. Bioinformatics Research Center, North Carolina State University, Raleigh, United States
    Contribution
    Conceptualization, Software, Formal analysis, Validation, Visualization, Methodology, Writing—original draft, Writing—review and editing
    For correspondence
    drasmus@ncsu.edu
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9457-7561
  2. Tanja Stadler

    1. Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
    2. Swiss Institute of Bioinformatics, Lausanne, Switzerland
    Contribution
    Formal analysis, Supervision, Methodology, Writing—review and editing
    Competing interests
    No competing interests declared

Funding

Seventh Framework Programme (European Research Commission)

  • Tanja Stadler

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank Denise Kühnert for helpful advice on how to implement the MFBD model in BEAST 2, Katia Koelle for advice on the influenza H3N2 analysis, and Louis du Plessis for permission to reproduce the model schematic shown in Figure 1A.

Senior Editor

  1. Diethard Tautz, Max-Planck Institute for Evolutionary Biology, Germany

Reviewing Editor

  1. Aleksandra M Walczak, École Normale Supérieure, France

Reviewer

  1. Trevor Bedford, Fred Hutchinson Cancer Research Center, United States

Publication history

  1. Received: January 27, 2019
  2. Accepted: July 26, 2019
  3. Accepted Manuscript published: August 14, 2019 (version 1)
  4. Accepted Manuscript updated: August 15, 2019 (version 2)
  5. Version of Record published: August 29, 2019 (version 3)

Copyright

© 2019, Rasmussen and Stadler

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 675
    Page views
  • 118
    Downloads
  • 0
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Evolutionary Biology
    Jens Bast et al.
    Short Report Updated
    1. Developmental Biology
    2. Evolutionary Biology
    Yoseop Yoon et al.
    Research Article