Dengue genetic divergence generates within-serotype antigenic variation, but serotypes dominate evolutionary dynamics

Abstract
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

Dengue virus (DENV) exists as four genetically distinct serotypes, each of which is historically assumed to be antigenically uniform. Recent analyses suggest that antigenic heterogeneity may exist within each serotype, but its source, extent and impact remain unclear. Here, we construct a sequence-based model to directly map antigenic change to underlying genetic divergence. We identify 49 specific substitutions and four colinear substitution clusters that robustly predict dengue antigenic relationships. We report moderate antigenic diversity within each serotype, resulting in genotype-specific patterns of heterotypic cross-neutralization. We also quantify the impact of antigenic variation on real-world DENV population dynamics, and find that serotype-level antigenic fitness is a dominant driver of dengue clade turnover. These results provide a more nuanced understanding of the relationship between dengue genetic and antigenic evolution, and quantify the effect of antigenic fitness on dengue evolutionary dynamics.

Introduction

Dengue virus (DENV) is a mosquito-borne flavivirus which consists of four genetically distinct clades, canonically thought of as serotypes (DENV1 – DENV4) (Lanciotti et al., 1997). DENV circulates primarily in South America and Southeast Asia, infecting 400 million people annually. Primary DENV infection is more often mild and is thought to generate lifelong homotypic immunity and temporary heterotypic immunity, which typically wanes over 6 months to 2 years (Sabin, 1952; Reich et al., 2013; Katzelnick et al., 2016). Subsequent heterotypic secondary infection induces broad cross-protection, and symptomatic tertiary and quaternary cases are rare (Gibbons et al., 2007; Olkowski et al., 2013). However, a small subset of secondary infections are enhanced by non-neutralizing, cross-reactive antibodies, resulting in severe disease via antibody-dependent enhancement (ADE) (Halstead, 1979; Katzelnick et al., 2017; Sangkawibha et al., 1984; Salje et al., 2018). Approximately 1–3% of cases progress to severe dengue hemorrhagic fever, causing $\sim$ 9000 deaths each year (Bhatt et al., 2013; Stanaway et al., 2016) and relative risk of severe dengue from secondary heterotypic infection relative to primary infection is estimated to be $\sim$ 24 (Mizumoto et al., 2014). Thus, the antigenic relationships between dengue viruses — describing whether the immune response generated after primary infection results in protection or enhancement of secondary infection — are key drivers of DENV case outcomes and epidemic patterns.

While each serotype is clearly genetically and antigenically distinct, it is not clear how subserotype clades of DENV interact antigenically. Each DENV serotype consists of broad genetic diversity (Figure 1A), including canonical clades termed ‘genotypes’ (Rico-Hesse, 1990; Twiddy et al., 2003). Specific genotypes have been associated with characteristically mild or severe disease, and heterogeneous neutralization titers suggest that the immune response to some genotypes is more cross-protective than others (Gentry et al., 1982; Russell and Nisalak, 1967). Until recently, it has been assumed that these intraserotype differences are minimally important compared to interserotype differences. However, empirical evidence has demonstrated that these genotype-specific differences can drive case outcomes and epidemic severity (reviewed in Holmes and Twiddy, 2003). For example, analysis of a longitudinal cohort study demonstrated that specific combinations of primary infection serotype and secondary infection genotype can mediate individual case outcomes (OhAinle et al., 2011). On a population scale, the DENV1-immune population of Iquitos, Peru, experienced either entirely asymptomatic or very severe epidemic seasons in response to two different genotypes of DENV2 (Kochel et al., 2002).

Figure 1 with 1 supplement see all

Download asset Open asset

Phylogeny of dengue virus sequences and normalized antigenic distances.

(A) Maximum likelihood phylogeny of the E (envelope) gene from titered dengue viruses. Notably, each of the four serotypes contains substantial genetic diversity. (B) Pairwise antigenic distances were estimated by Katzelnick et al. (2015) using plaque reduction neutralization titers (PRNT50, see Materials and methods). Aggregated titer values are standardized such that the distance between autologous virus-serum pairs is 0, and each titer unit corresponds to a two-fold change in PRNT50 value. Light gray areas represent missing data. Larger values correspond to greater antigenic distance.

Figure 1—source data 1: https://cdn.elifesciences.org/articles/42496/elife-42496-fig1-data1-v3.csv
Download elife-42496-fig1-data1-v3.csv

One explanation for these and similar observations is that overlooked intraserotype antigenic variation contributes to these genotype-specific case outcomes and epidemic patterns. Recent efforts to antigenically characterize diverse DENV viruses suggests that each serotype may contain antigenic heterogeneity, but the source and impact of this heterogeneity is not clear (Katzelnick et al., 2015). Here, we characterize the evolutionary basis for observed antigenic heterogeneity among DENV clades. We also quantify the impact of within- and between-serotype antigenic variation on real-world DENV population dynamics.

Results

Dengue neutralization titer data

Antigenic distance between a pair of viruses $i$ and $j$ is experimentally quantified using neutralization titers, which measure how well serum drawn after infection with virus $j$ is able to neutralize virus $i$ in vitro (Russell and Nisalak, 1967). Throughout the following, we refer to serum raised against virus $j$ as serum $j$ for brevity. To measure the pairwise antigenic distances for a panel of diverse DENV viruses (Figure 1), Katzelnick et al. (2015) infected naive non-human primates (NHP) with each virus, drew sera at 3 months post-infection, and then titered this sera against a panel of test viruses. To compare patterns of cross-protection in NHP and humans, they also drew sera from 31 study participants 6 weeks after inoculation with a monovalent component of the NIH dengue vaccine candidate. This sera was also titered against a broad panel of DENV viruses. As originally reported, we find generally consistent patterns of neutralization between the NHP and human sera data; see Katzelnick et al. (2015) for a detailed comparison. In total, our dataset consists of 454 NHP sera titers spanning the breadth of DENV diversity, and 728 human sera titers providing deep coverage of a small subset of viruses.

To normalize these measurements, we take the log $_{2}$ of each value, such that one antigenic unit corresponds to a two-fold drop in neutralization, and we define antigenic distance between autologous serum-virus pairs (i.e. virus $i$ and serum $i$ ) as zero. Normalized antigenic distance between virus $i$ and serum $j$ is thus calculated as $D_{i j} = \log_{2} (T_{i i}) - \log_{2} (T_{i j})$ , such that a higher value of $D_{i j}$ indicates that serum $j$ is less effective at neutralizing virus $i$ , implying greater antigenic distance between viruses $i$ and $j$ . For brevity, these normalized titer values are hereafter referred to simply as log $_{2}$ (titers).

The full dataset of standardized titer values is shown in Figure 1B. Here, we see that homotypic virus-serum pairs are more closely related antigenically than heterotypic pairs. However, we also observe large variance around this trend, both within and between serotypes. This suggests that treating each serotype as antigenically uniform potentially overlooks important antigenic heterogeneity across viruses within each serotype.

Dengue antigenic evolution corresponds to genetic divergence

Titer measurements are prone to noise, and there is a limited amount of available titer data. If the antigenic heterogeneity observed in the raw data is truly the result of an underlying evolutionary process, we expect that differences in antigenic phenotype correspond to underlying mutations in surface proteins. Dengue has two surface proteins, prM (membrane) and E (envelope). While previous studies have identified epitopes on both prM and E, it is believed that antibodies involved in ADE primarily target prM, while neutralizing antibodies primarily target E (de Alwis et al., 2014). The assay used to generate this titer dataset captures neutralization, but does not capture the effects of ADE; we thus focus our analysis on the E gene.

To fully map the relationship between DENV genetic and antigenic evolution, we adapt a substitution-based model originally developed for influenza (Neher et al., 2016). Conceptually, this model predicts titer values through three steps. First, we align E gene sequences from titered dengue viruses and catalog the amino acid mutations between each serum strain and test virus strain in our dataset. Next, we infer how much antigenic change is attributable to specific mutations by constructing a parsimonious model that links normalized antigenic distances to observed mutations. This assigns each mutation $m$ an antigenic effect size, $d_{m} \geq 0$ ; forward and reverse mutations are assigned separate values of $d_{m}$ . With this in hand, we estimate the asymmetrical antigenic distance ${\hat{D}}_{i j}$ between all pairs of sera and test viruses by summing over $d_{m}$ for all mutations observed between the serum and the test virus (Materials and methods, Equation 2).

To learn these values of $d_{m}$ , we first split our dataset into training (random 90% of measurements) and test data (the remaining 10% of values). We take the training data and fit $d_{m}$ for each mutation that is observed two or more times, subject to regularization as follows (also detailed in Materials and methods, Equation 3). Parsimoniously, we expect that antigenic change is more likely to be incurred by a few key mutations than by many mutations; correspondingly, our prior expectation of values of $d_{m}$ is exponentially distributed such that most values of $d_{m} = 0$ . This is directly analogous to lasso regression to identify a few parameters with positive weights and set other parameters to 0 (Tibshirani, 1996). Additionally, some viruses have greater binding avidity, and some sera are more potent than others (Figure 1—figure supplement 1); these ‘row’ and ‘column’ effects, respectively, are normally distributed and are taken into account when training the model. The model uses convex optimization to learn the values of $d_{m}$ that minimize the sum of squared errors (SSE) between observed and predicted titers in the training data. We thus learn model parameters from the training data, and then use those parameters to predict test data values. We assess model performance by comparing the predicted test titer values to the actual values, aggregated across 100-fold Monte Carlo cross validation.

This model formulation is an effective tool for estimating antigenic relationships between viruses based on their genetic sequences. On average across cross-validation replicates, this model yields a root mean squared error (RMSE) of 0.75 when predicting titers relative to their true value (95% CI 0.74–0.77, RMSE), and explains 78% of the observed variation in neutralization titers overall (95% CI 0.77–0.79, Pearson $R^{2}$ ). This is comparable to the model error from a cartography-based characterization of the same dataset (RMSE 0.65–0.8 log $_{2}$ titer units) (Katzelnick et al., 2015). Prediction error was comparable between human and non-human primate sera, indicating that these genetic determinants of antigenic phenotypes are not host species-specific (Figure 2—figure supplement 2).

The 48 strains included in the titer dataset (as serum strains, test virus strains, or both) are 25.7% divergent on average (amino acid differences in E). Pairwise comparisons of all serum strains and test viruses yields 1534 unique mutations that are observed at least twice. Our parsimonious model attributes antigenic change to a total of 49 specific mutations and four colinear mutation clusters (each consisting of 2–6 co-occurring mutations) (Figure 2, Table 1). Each of these mutations confers 0.01–2.11 (median 0.19) log $_{2}$ titer units of antigenic change; 27 mutations or mutation clusters have $d_{m} \geq 0.2$ . These mutations span all domains of E, and most occur both between and within serotypes (Figure 2).

Figure 2 with 2 supplements see all

Download asset Open asset

Distribution and effect size of antigenic mutations.

Each point represents one antigenically relevant mutation or colinear mutation cluster. Clustered mutations are connected with dashed lines with point size proportionate to cluster size (N = 2–6). The x axis indicates mutations’ position in E, relative to each functional domain as noted in (B). The y axis indicates antigenic effect size.

Figure 2—source data 1: https://cdn.elifesciences.org/articles/42496/elife-42496-fig2-data1-v3.csv
Download elife-42496-fig2-data1-v3.csv

Table 1

Antigenically relevant mutations.

Each entry represents a mutation (or colinear cluster of mutations) inferred by the titer model to have a non-zero antigenic effect size $d_{m}$ (shown in parentheses).

I6V, S29G, F90Y, T176P, V197I, L475M (0.71)	D154E (0.03)	T339I (0.65)
A19T (0.02)	K160V (0.03)	V347A (0.28)
N83K (0.34)	E161T (0.22)	I380V (0.45)
A88K (0.08)	A162I (0.19)	V382A (0.04)
A88Q (1.39)	I162A (0.29)	V382I (0.37)
Y90F (0.43)	I164V (0.01)	N385K, V454I (0.12)
V91I (0.35)	T171S (0.23)	D390S (0.12)
K93R (0.20)	V174E (0.46)	N390S (0.18)
V114I (0.90)	E180T (0.43)	V454T (0.25)
L122S (0.03)	N203E (0.04)	V461F (0.01)
S122L (0.07)	N203K (0.08)	F461I (0.03)
N124K (0.10)	D203N (0.14)	I462V (0.10)
V129I (0.01)	E203N (1.09)	L462I (0.16)
I129V (0.21)	E203D (1.34)	V462L (2.11)
I129A (0.23)	I308V (0.18)	T478S (0.10)
Y132I (0.12)	G330D (0.22)	S478M (0.43)
Y132P, R233Q (0.34)	I335V, N355T, P364V(0.18)	V484I (0.39)
I139V (0.05)	L338E (0.43)

Each serotype of dengue contains moderate antigenic heterogeneity

By linking antigenic change to specific mutations, we are able to estimate unmeasured antigenic distances between any pair of viruses in the dataset based on their genetic differences. As an example, we estimated the antigenic distance between serum raised against each monovalent component of the NIH vaccine candidate and all other viruses in the dataset. As shown in Figure 3, vaccine-elicited antibodies result in strong homotypic neutralization, but heterotypic cross-neutralization varies widely between specific strains. This has important ramifications for vaccine design and trial evaluation.

Figure 3

Download asset Open asset

Antigenic distance from NIH vaccine strains.

By assigning a discrete increment of antigenic change to each mutation, we can estimate the asymmetrical antigenic distance between any serum strain and test virus strain based on their genetic differences. Here, we show the estimated antigenic distance between serum raised against each monovalent component of the NIH vaccine candidate (indicated as ‘X’) and each test virus in the tree.

We also observe antigenic heterogeneity at the genotype level. On average, heterotypic genotypes are separated by 6.9 antigenic mutations (or colinear mutation clusters) and 2.18 log $_{2}$ titers. Homotypic genotypes are separated by a mean of 1.9 antigenic mutations, conferring a total of 0.30 log $_{2}$ titers of antigenic distance (Figure 4). Notably, the titer dataset spans the breadth of canonical DENV genotypes, but in most cases lacks the resolution to detect within-genotype antigenic diversity. We thus expect that these results represent a lower bound on the true extent of DENV intraserotype antigenic diversity.

Figure 4

Download asset Open asset

Titer distance by genotype.

Values represent the mean interpolated antigenic distance between canonical dengue genotypes (in standardized log $_{2}$ titer units). Columns represent sera; rows represent test viruses.

Figure 4—source data 1: https://cdn.elifesciences.org/articles/42496/elife-42496-fig4-data1-v3.csv
Download elife-42496-fig4-data1-v3.csv

In summary, we have identified a small number of antigenically relevant mutations that explain most of the observed antigenic heterogeneity in dengue, as indicated by neutralization titers. These mutations occur both between and within serotypes, suggesting that dengue antigenic evolution is an ongoing, although gradual, process. This results in strain-specific and genotype-level antigenic variation, although the scale of this variation is small compared to serotype-level differences. From this, we conclude that there is antigenic variation within each serotype of DENV, and that this is driven by underlying genetic divergence.

Antigenic novelty predicts serotype success

From the titer model, we find evidence that homotypic genotypes of DENV vary in their ability to escape antibody neutralization. However, antibody neutralization is only one of many factors that shape epidemic patterns. We investigate whether the observed antigenic diversity influences dengue population dynamics in the real world.

The size of the viral population (i.e. prevalence, commonly analyzed using SIR models as reviewed in Lourenço et al., 2018) is determined by many complex factors, and reliable values for population prevalence are largely unavailable. Contrastingly, the composition of the viral population (i.e. the relative frequency of each viral clade currently circulating) can be estimated over time by examining historical sequence data (Lee et al., 2018; Neher et al., 2016), and is primarily driven by viral fitness (Bedford et al., 2011).

In meaningfully antigenically diverse viral populations, antigenic novelty (relative to standing population immunity) contributes to viral fitness: as a given virus $i$ circulates in a population, the proportion of the population that is susceptible to infection with $i$ –and other viruses antigenically similar to $i$ –decreases over time as more people acquire immunity (Bedford et al., 2012; Luksza and Lässig, 2014). Antigenically novel viruses that are able to escape this population immunity are better able to infect hosts and sustain transmission chains, making them fitter than the previously circulating viruses (Zhang et al., 2005; Bedford et al., 2012; Gupta et al., 1998; Wearing and Rohani, 2006; Lourenço and Recker, 2013). Thus, if antigenic novelty constitutes a fitness advantage for DENV, then we would expect greater antigenic distance from recently circulating viruses to correlate with higher growth rates.

To test this hypothesis, we examine the composition of the dengue virus population in Southeast Asia from 1970 to 2015. We estimate the relative population frequency of each DENV serotype at three month intervals, $x_{i} (t)$ (Figure 5A), based on their observed relative abundance in the ‘slice’ of the phylogeny corresponding to each timepoint (N = 8644 viruses; see Materials and methods, Equations 4-5). While there is insufficient data to directly compare these estimated frequencies to regional case counts, we see good qualitative concordance between frequencies similarly estimated for Thailand and previously reported case counts from Bangkok (Figure 5—figure supplement 1).

Figure 5 with 3 supplements see all

Download asset Open asset

Antigenic novelty predicts serotype success.

(A) The relative frequency of each serotype, $x_{i}$ , in Southeast Asia estimated every 3 months based on available sequence data. (B) Total fitness of each serotype. We calculate antigenic fitness for each serotype over time as its frequency-weighted antigenic distance from recently circulating viruses. We then add this to a time-invariant intrinsic fitness value to calculate total fitness. (C) DENV1 frequencies between 1994 and 1996 alongside model projection. At each timepoint $t$ , we blind the model to all empirical data from timepoints later than $t$ and predict each serotype’s future trajectory based on its initial frequency, time-invariant intrinsic fitness, and antigenic fitness at time $t$ (Materials and methods, Equation 11). We predict forward in 3-month increments for a total prediction period of $d t = 2$ years. At each increment, we use the current predicted frequency to adjust our estimates of antigenic fitness on a rolling basis (Materials and methods, Equation 15). (D) Predicted growth rates, $\hat{g} = \frac{\hat{x_{i}} (t + d t)}{x_{i} (t)}$ , compared to empirically observed growth rates, $g = \frac{x_{i} (t + d t)}{x_{i} (t)}$ . Predicted and empirical growth rate of the example illustrated in (C) is shown in (D) as the blue point. Serotype growth versus decline is accurate (i.e. the predicted and actual growth rates are both $> 1$ or both $< 1$ , all points outside the gray area) for 66% of predictions.

Figure 5—source data 1: https://cdn.elifesciences.org/articles/42496/elife-42496-fig5-data1-v3.csv
Download elife-42496-fig5-data1-v3.csv

Table 2

Optimized fitness model parameters for primary analysis.

Parameter	Value	Description
$β$	1.02	Slope of linear relationship between population immunity and viral fitness
$γ$	0.83	Proportion of titers waning each year since primary infection
$σ$	0.76	Slope of linear relationship between titers and probability of protection
$f_{0}^{(1)}$	0.74	Relative intrinsic fitness of DENV1
$f_{0}^{(2)}$	0.84	Relative intrinsic fitness of DENV2
$f_{0}^{(3)}$	0.50	Relative intrinsic fitness of DENV3
$f_{0}^{(4)}$	0.00	Relative intrinsic fitness of DENV4 (fixed)

Table 3

Fitness model performance comparisons.

Here we compare the performance of the antigenically-informed fitness models to model performance under two null formulations. In the ‘equal’ null model, all clades are assigned equal fitness (i.e. antigenic and intrinsic fitness are set to 0, Equation 17; Equation 18). In the ‘intrinsic’ null model formulation, only the serotype-specific, time-invariant intrinsic fitness values contribute to clade fitness (i.e. antigenic fitness is set to 0, Equation 19; Equation 20). For both formulations of generalized waning, all other parameters were set to the values reported in Table 2 (optimized for RMSE). Parameters for heterotypic waning were optimized separately.

Resolution	Fitness model	Waning	RMSE	Pearson $R^{2}$	Accuracy
Serotype	Interserotype	Generalized	0.105	0.547	0.660
Serotype	Equal fitness null	Generalized	0.130	0.000	0.480
Serotype	Intrinsic fitness null	Generalized	0.140	0.042	0.510
Genotype	Interserotype	Generalized	0.062	0.286	0.666
Genotype	Intergenotype	Generalized	0.062	0.254	0.610
Genotype	Equal fitness null	Generalized	0.070	0.000	0.440
Genotype	Intrinsic fitness null	Generalized	0.072	0.032	0.530
Serotype	Interserotype	Heterotypic	0.109	0.533	0.666
Genotype	Interserotype	Heterotypic	0.063	0.291	0.661
Genotype	Intergenotype	Heterotypic	0.063	0.203	0.599

Fitter virus clades increase in frequency over time, such that $x_{i} (t + d t) > x_{i} (t)$ . It follows that these clades have a growth rate—defined as the fold-change in frequency over time—greater than one: $\frac{x_{i} (t + d t)}{x_{i} (t)} > 1$ . To isolate the extent to which antigenic fitness contributes to clade success and decline, we extend work by Luksza and Lässig (2014) to build a simple model that attempts to predict clade growth rates based on two variables: the antigenic fitness of the clade at time $t$ , and a time-invariant-free parameter representing the intrinsic fitness of the serotype the clade belongs to. We estimate the antigenic fitness of clade $i$ at time $t$ as a function of its antigenic distance from each viral clade $j$ that has circulated in the same population over the previous 2 years, weighted by the relative frequency of $j$ and adjusted for waning population immunity (Figure 5B; Materials and methods, Equation 6-10). Growth rates are estimated based on a 2-year sliding window (Figure 5C).

This simple model explains 54.7% of the observed variation in serotype growth rates, and predicts serotype growth vs. decline correctly for 66.0% of predictions (Figure 5D). This suggests that antigenic fitness is a major driver of serotype population dynamics. This also demonstrates that this model captures key components of dengue population dynamics; examining the formulation of this model in more detail can yield insights into how antigenic relationships influence DENV population composition. The fitness model includes six free parameters that are optimized such that the model most accurately reproduces the observed fluctuations in DENV population composition (minimizing the RMSE of frequency predictions, see Materials and methods). We find that serotype fluctuations are consistent with a model wherein population immunity wanes linearly over time, with the probability of protection dropping by about 63% per year for the first 2 years after primary infection. This model assumes no fundamental difference between homotypic and heterotypic reinfection; rather, homotypic immunity is assumed to wane at the same rate as heterotypic immunity, but starts from a higher baseline of protection based on closer antigenic distances. We also find that these dynamics are best explained by intrinsic fitness that moderately varies by serotype (Table 2); we are not aware of any literature that directly addresses this observation via competition experiments. However, intrinsic fitness alone is unable to predict serotype dynamics (Table 3) and relative strength of antigenic fitness and intrinsic fitness are approximately matched in determining overall serotype fitness.

Antigenic novelty also partially predicts genotype success

To estimate how well antigenic fitness predicts genotype dynamics, we used the same model to predict genotype success and decline. As before, fitness of genotype $i$ is based on the intrinsic fitness of the serotype $i$ belongs to, and the antigenic distance between $i$ and each other genotype, $j$ , that has recently circulated (Equation 6-10). For genotypes, we can calculate antigenic distance between $i$ and $j$ at either the serotype level or the genotype level. In the ‘interserotype model’, we treat each serotype as antigenically uniform, and assign the mean serotype-level antigenic distances to all pairs of constituent genotypes. In the ‘intergenotype model’, we incorporate the observed within-serotype heterogeneity, and use the mean genotype-level antigenic distances (as shown in Figure 4). If within-serotype antigenic heterogeneity contributes to genotype fitness, then we would expect estimates of antigenic fitness based on the ‘intergenotype model’ to better predict genotype growth rates.

We find that antigenic fitness contributes to genotype turnover, although it explains less of the observed variation than for serotypes. As for serotypes, intrinsic fitness alone was unable to predict genotype turnover (Table 3). When antigenic distance is estimated from the ‘interserotype model’, we find that our model of antigenic fitness explains approximately 28.6% of the observed variation in genotype growth rates, and correctly predicts genotype growth vs. decline 66.6% of the time (Figure 6C). Perhaps surprisingly, more precise estimates of antigenic distance between genotypes from the ‘intergenotype model’ does not improve our predictions of genotype success ( $R^{2} = 0.254$ , 61.0% accuracy; Figure 6D, Table 3). This suggests that although we find strong evidence that genotypes vary in their ability to escape neutralizing antibodies, these differences are subtle enough that they do not impact broad-scale regional dynamics over time.

Figure 6

Download asset Open asset

Antigenic novelty partially predicts genotype success.

(A) Relative frequencies of each canonical dengue genotype across Southeast Asia, estimated from available sequence data. (B) Antigenic fitness is calculated for each genotype as its frequency-weighted antigenic distance from recently circulating genotypes. We then add this to a time-invariant, serotype-specific intrinsic fitness value to calculate total fitness (shown here, arbitrary units). We assess antigenic distance at either the ‘intergenotype’ or the ‘interserotype’ resolution. In this panel, we show total fitness over time, incorporating estimates of antigenic fitness derived from the ‘intergenotype’ model. (**C, D**) Fitness estimates were used to predict clade growth rates over 2 years, compounding immunity every 3 months based on predicted frequency changes (Materials and methods Equation 15). Here, we compare observed vs. predicted growth rates for both formulations of the fitness model (using fitness derived from either ‘interserotype’ or ‘intergenotype’ antigenic distances). Growth versus decline was accurate (predicted and actual growth rates both $> 1$ or both $< 1$ , points outside the gray shaded area) for 67% and 61% of predictions, respectively.

Figure 6—source data 1: https://cdn.elifesciences.org/articles/42496/elife-42496-fig6-data1-v3.csv
Download elife-42496-fig6-data1-v3.csv

Discussion

Within-serotype antigenic heterogeneity

We show that mapping antigenic change to specific mutations and interpolating across the DENV alignment is able to explain a large majority of the observed variation in antigenic phenotypes, as measured by neutralization titers. We identify 49 specific mutations and 4 colinear mutation clusters that contribute to antigenic variation, of which 27 mutations or mutation clusters have an antigenic impact of 0.20 log $_{2}$ titers or greater. These mutations span all major domains of E, and occur both within and between serotypes. This demonstrates that DENV antigenic divergence is closely coupled to genetic divergence. We use these mutations to infer unmeasured antigenic relationships between viruses, revealing substantial within-serotype antigenic variation. For comparison, we reconstructed the ancestral sequence of each serotype and constrained the model to only permit antigenic change to be attributed to these serotype-level differences. While this interserotype-only model predicts titers to a reasonable degree, we find that it has higher error (RMSE = 0.86) than the full model which accounts for within-serotype heterogeneity (RMSE = 0.79; Table 4). This supports and expands upon previous reports (Katzelnick et al., 2015; Forshey et al., 2016; Waggoner et al., 2016) that the null hypothesis of antigenically uniform serotypes is inconsistent with observed patterns of cross-protection and susceptibility.

Table 4

Titer model performance comparisons.

We compared performance across several different variations of the titer model. As described in Neher et al. (2016), incremental antigenic change can be assigned to either amino acid substitutions (‘Substitution’ model) or to branches in the phylogeny (‘Tree’ model). For each of these models, we can constrain the model such that antigenic change is allowed to occur only between serotypes (‘Interserotype’) or between AND within serotypes (‘Full’). For the substitution model, we constrain the interserotype model by reconstructing the amino acid sequence of the most recent common ancestor for each serotype and allowing the model to assign antigenic change only to mutations between these ancestral sequences. For the tree model, we constrain the interserotype model by allowing the model to assign antigenic change only to branches in the phylogeny that lie between serotypes. We also assess the impact of the virus avidity and serum potency terms, $v_{a}$ and $p_{b}$ . For all models and metrics, we report the mean and 95% confidence interval across 100-fold Monte Carlo cross validation with random 90%:10%, training:test splits.

Model	Antigenic resolution	$v_{a}$ And $p_{b}$	RMSE	Pearson $R^{2}$
Substitution	Full	Yes	0.75 (0.74–0.77)	0.78 (0.77–0.79)
Substitution	Full	No	1.13 (1.11–1.16)	0.50 (0.48–0.52)
Substitution	Interserotype	Yes	0.86 (0.85–0.88)	0.72 (0.70–0.73)
Substitution	Interserotype	No	0.86 (0.84–0.87)	0.71 (0.70–0.72)
Tree	Full	Yes	0.84 (0.83–0.86)	0.72 (0.71–0.73)
Tree	Full	No	1.40 (1.38–1.42)	0.24 (0.23–0.26)
Tree	Interserotype	Yes	0.87 (0.85–0.88)	0.70 (0.69–0.71)
Tree	Interserotype	No	0.86 (0.84–0.88)	0.72 (0.71–0.73)

Consistent with the relatively long timescale of dengue evolution, we observe many sites in the dengue phylogeny to have mutated multiple times. These represent instances of parallelism, reversion and homoplasy. For example, we observe that site 390 is consistently S in DENV1, N in DENV3 and H in DENV4, while DENV2 genotypes show a mixture of D, N and S (Figure 2—figure supplement 1). We estimate an antigenic impact of 0.18 log $_{2}$ titers of the N390S mutation. Our model predicts that the parallel N390S mutations in DENV1 and DENV2 Cosmopolitan makes these viruses slightly more antigenically similar rather than more antigenically distinct. Along these lines, we compared the ‘substitution’ model to a similar model formulation (termed the ‘tree’ model) which assigns $d_{m}$ values to individual branches in the phylogeny, rather than to individual mutations, so that each branch with a positive $d_{m}$ value increases antigenic distance between strains (Neher et al., 2016). As expected from the high degree of homoplasy across the dengue phylogeny, we observe that the ‘substitution’ model outperforms the ‘tree’ model in predicting titers in validation datasets (Table 3).

To investigate the impact of this observed variation, we examine patterns of neutralization in response to vaccination with each monovalent component of the NIH vaccine candidate (Figure 3). Here, we see that each monovalent component elicits broad homotypic protection, but levels of heterotypic cross protection vary widely between heterotypic genotypes. This is consistent with previous reports of genotype-specific interactions between standing population immunity and subsequent heterotypic epidemics as modulating epidemic severity (OhAinle et al., 2011; Kochel et al., 2002). We hypothesize that this observed within-serotype variation primarily effects heterotypic secondary infection outcomes, rather than modulating homotypic immunity. Although we note that Juraska et al. (2018) demonstrate that vaccine efficacy decreases with increasing amino acid divergence of breakthrough infections from the vaccine insert.

Overall, we expect that these antigenic phenotypes represent a lower bound on the extent, magnitude, and nature of antigenic heterogeneity with DENV. Our current titer dataset spans the breadth of DENV diversity, but due to small sample size, it lacks the resolution to detect most sub-genotype antigenic variation. The appearance of the deep antigenic divergence of the four serotypes, and the more recent antigenic divergences within each serotype, suggest that DENV antigenic evolution is likely an ongoing, although gradual, process. We therefore expect that future studies with richer datasets will find additional antigenic variation within each genotype. This dataset also contains many left-censored titer values, where we know two viruses are at least $T$ titer units apart, but do not know exactly how far apart. If we knew the true value of these censored titers, many of them would indicate larger antigenic distances than the reported values, $T$ , which are used to train the model. Thus, it is likely that our model systematically underestimates the magnitude of titer distances.

Finally, antibody neutralization and escape (as measured by PRNT titers) is only one component of the immune response to DENV. Although analysis of a longitudinal cohort study shows that these neutralization titers correlate with protection from severe secondary infection, it is unclear how PRNT titers correspond to antibody-dependent enhancement (Katzelnick et al., 2016). It is also important to note that DENV case outcomes are partially mediated by interactions with innate and T-cell immunity, the effects of which are not captured in neutralization titers (Green et al., 2014). Overall, while richer datasets and the development of more holistic assays will be required in order to fully characterize the extent of DENV antigenic diversity, it is clear that the four-serotype model is insufficient to explain DENV antigenic evolution.

Viral clade dynamics

We use these inferred antigenic relationships to directly quantify the impact of antigenic fitness on DENV population composition. To do so, we measure serotype frequencies across Southeast Asia over time and construct a model to estimate how they will fluctuate (Materials and methods, Equation 6-16). This model places a fitness value on each serotype that derives from a constant intrinsic component alongside a time-dependent antigenic component. Antigenic fitness declines with population immunity, which is accumulated via the recent circulation of antigenically similar viruses. Our primary model parameterization assumes that both heterotypic and homotypic immunity wane linearly over time at the same rate, with homotypic immunity starting from a higher baseline of protection based on closer antigenic distances. We compared this to a secondary model parameterization with only heterotypic waning (see Materials and methods), under which we observe similar model performance (Table 3).

We find that antigenic fitness is able to explain much of the observed variation in serotype growth and decline (Figure 5). Forward simulations under the optimized parameter set display damped oscillations around the serotype-specific ‘set points’ determined by intrinsic fitnesses, but intrinsic fitness alone is unable to explain serotype fluctuations ( $R^{2} = 0.04$ ; Table 3, Figure 5—figure supplement 2). This demonstrates that although intrinsic fitness plays an important role in dictating long-term dynamics, wherein particular serotypes tend to circulate at low frequency (e.g. DENV4) and others at high frequency (e.g. DENV1 and DENV2), antigenic fitness plays out on shorter-term time scales, dictating circulation over several subsequent years.

We similarly use this model to quantify the effect of within-serotype antigenic variation on the success and decline of canonical DENV genotypes (Figure 6). As above, genotype antigenic fitness declines with population immunity. Here, we estimate population immunity based on antigenic distance from recently circulating genotypes, using distances that are either genotype-specific or based only on the serotype that each genotype belongs to. We then directly compare how strongly these coarser serotype-level versus specific genotype-level antigenic relationships impact DENV population dynamics. Overall, we find that antigenic fitness explains a moderate portion of the observed variation in genotype growth and decline. Surprisingly, however, we find that incorporating within-serotype antigenic differences does not improve our predictions (Figure 6C–D). We suggest two possible explanations for this observation.

First, it may be that although genotypes are antigenically diverse, these differences do not influence large-scale regional dynamics over time. We may then hypothesize that within-serotype antigenic heterogeneity mediates disease severity, but does not influence infection or onward transmission. This hypothesis is consistent with the findings of Nagao and Koelle (2008), who demonstrated that dengue epidemiological dynamics are compatible with a model wherein immunity confers protection against severe symptoms, but not asymptomatic infection. This is also consistent with Ten Bosch et al. (2018)’s findings that asymptomatic dengue infections contribute to onward transmission.

Alternatively, this lack of signal could be methodologically explained if either (A) genotype-level frequency trajectories estimated from public data are overly noisy for this application or (B) our model of antigenic fitness based on PRNT assay data does not match reality, due to either PRNT assay data not well reflecting human immunity or due to our particular model formulation that parameterizes immunity from titer distances (Equation 6-10). In the present analysis, we are not able to firmly resolve these disparate possibilities.

These observations are also subject to caveats imposed by the available data and model assumptions. Our estimates of antigenic fitness are informed by the antigenic distances inferred by the substitution model; thus, as above, we are unable to account for nuanced antigenic differences between sub-genotype clades of DENV due to limited titer data. We estimate DENV population composition over time based on available sequence data, pooled across all of Southeast Asia (Materials and methods, Equation 4). As the vast majority of cases of DENV are asymptomatic, sequenced viruses likely represent a biased sample of more severe cases from urban centers where patients are more likely to seek and access care. We also assume that Southeast Asia represents a closed viral population with homogeneous mixing. However, increasing globalization likely results in some amount of viral importation that is not accounted for in this model (Allicock et al., 2012). Finally, although Southeast Asia experiences hyperendemic DENV circulation, the majority of DENV transmissions are hyper-local (Salje et al., 2017), and viral populations across this broad region may not mix homogeneously each season. Thus, it is possible that these sub-serotype antigenic differences impact finer-scale population dynamics, but we lack the requisite data to examine this hypothesis.

Conclusions

We find that within-serotype antigenic evolution helps explain observed patterns of cross-neutralization among dengue genotypes. We also find that serotype-level population immunity is a strong determinant of viral clade dynamics across Southeast Asia. As richer datasets become available, future studies that similarly combine viral genomics, functional antigenic characterization, and population modeling have great potential to improve our understanding of how DENV evolves antigenically and moves through populations.

Model sharing and extensions

We have provided all source code, configuration files and datasets at https://github.com/blab/dengue-antigenic-dynamics (copy archived at https://github.com/elifesciences-publications/dengue-antigenic-dynamics), and wholeheartedly encourage other groups to adapt and extend this framework for further investigation of DENV antigenic evolution and population dynamics.

Materials and methods

Data

Titers

Request a detailed protocol

Antigenic distance between pairs of viruses $i$ and $j$ is experimentally measured using a neutralization titer, which measures how well serum drawn after infection with virus $i$ is able to neutralize virus $j$ in vitro (Russell and Nisalak, 1967). Briefly, two-fold serial dilutions of serum $i$ are incubated with a fixed concentration of virus $j$ . Titers represent the lowest serum concentration able to neutralize 50% of virus, and are reported as the inverse dilution. We used two publicly available plaque reduction neutralization titer (PRNT50) datasets generated by Katzelnick et al. (2015). The primary dataset was generated by infecting each of 36 non-human primates with a unique strain of DENV. NHP sera was drawn after 12 weeks and titered against the panel of DENV viruses. The secondary dataset was generated by vaccinating 31 human trial participants with a monovalent component of the NIH DENV vaccine. Sera was drawn after 6 weeks and titered against the same panel of DENV viruses. As discussed in Katzelnick et al., these two datasets show similar patterns of antigenic relationships between DENV viruses. In total, our dataset includes 1182 measurements across 48 virus strains: 36 of these were used to generate serum, and 47 were used as test viruses.

Sequences

Request a detailed protocol

For the titer model analysis, we used the full sequence of E (envelope) from the 48 strains in the titer dataset.

For the clade frequencies analysis, we downloaded all dengue genome sequences available from the Los Alamos National Lab Hemorrhagic Fever Virus Database as of March 7, 2018, that contained at least the full coding sequence of E (envelope) (total N = 12,645) (Kuiken et al., 2012). We discarded sequences which were putative recombinants, duplicates, lab strains, or which lacked an annotated sampling location and/or sampling date. We selected all remaining virus strains that were annotated as a Southeast Asian isolate (total N = 8,644).

For both datasets, we used the annotated reference dataset from Pyke et al. (2016) to assign sequences to canonical genotypes.

Alignments and trees

Request a detailed protocol

We used MAFFT v7.305b to align nucleotide E gene sequences for each strain before translating the aligned sequences (no frame-shift indels were present) (Katoh and Standley, 2013). All maximum likelihood phylogenies were constructed with IQ-TREE version 1.6.8 and the GTR + I + G15 nucleotide substitution model (Nguyen et al., 2015).

Titer model

Request a detailed protocol

We compute standardized antigenic distance between virus $i$ and serum $j$ (denoted $D_{i j}$ ) from measured titers $T_{i j}$ relative to autologous titers $T_{i i}$ , such that

D_{i j} = \log_{2} (T_{i i}) - \log_{2} (T_{i j}) .

We then average normalized titers across individuals. To predict unmeasured titers, we employ the ‘substitution model’ from Neher et al. (2016) and implemented in Nextstrain (Hadfield et al., 2018), which assumes that antigenic evolution is driven by underlying genetic evolution.

In the substitution model, observed titer drops are mapped to mutations between each serum and test virus strain after correcting for overall virus avidity, $v_{i}$ , and serum potency, $p_{j}$ (‘row’ and ‘column’ effects, respectively), so that

D_{i j} \approx {\hat{D}}_{i j} = \sum_{m} d_{m} + v_{i} + p_{j},

where $d_{m}$ is the titer drop assigned to each mutation, $m$ , between serum $i$ and virus $j$ , and $m$ iterates over mutations. We randomly withhold 10% of titer measurements as a test set. We use the remaining 90% of titer measurements as a training set to learn values for virus avidity, serum potency, and mutation effects. As in Neher et al. (2016), we formulate this as a convex optimization problem and solve for these parameter values to minimize the cost function

C = \sum_{i, j} ({\hat{D}}_{i j} - D_{i j})^{2} + λ \sum_{m} d_{m} + κ \sum_{i} v_{i}^{2} + δ \sum_{j} p_{j}^{2} .

We used $λ = 3.0$ , $κ = 0.6$ , and $δ = 1.2$ to minimize test error. Respectively, these terms represent the squared training error; an L1 regularization term on mutation effects, such that most values of $d_{m} = 0$ ; and L2 regularization terms on virus avidities and serum potencies, such that they are normally distributed. These parameter values are then used to predict the antigenic distance between all pairs of viruses, $i$ and $j$ . We assess performance by comparing predicted to known titer values in our test data set, and present test error (aggregated from 100-fold Monte Carlo cross-validation) throughout the manuscript.

Viral clade dynamics

Empirical clade frequencies

Request a detailed protocol

As discussed in Neher et al. (2016) and Lee et al. (2018), we estimate empirical clade frequencies from 1970 to 2015 based on observed relative abundances of each clade in the ‘slice’ of the phylogeny corresponding to each quarterly timepoint.

Briefly, the frequency trajectory of each clade in the phylogeny is modeled according to a Brownian motion diffusion process discretized to 3-month intervals. Relative to a simple Brownian motion, the expectation includes an ‘inertia’ term that adds velocity to the diffusion and the variance includes a term $x (1 - x)$ to scale variance according to frequency following a Wright-Fisher population genetic process. This results in the diffusion process

x (t + d t) = 𝒩 (x (t) + ϵ d x, d t σ^{2} x (t) (1 - x (t)))

with ‘volatility’ parameter $σ^{2}$ and inertia parameter $ϵ$ . The term $d x$ is the increment in the previous timestep, so that $d x = x (t) - x (t - d t)$ . We used $ϵ = 0.7$ and $σ = 2.0$ to maximize fit to empirical trajectory behavior.

We also include an Bernoulli observation model for clade presence/absence among sampled viruses at timestep $t$ . This observation model follows

f (x, t) = \prod_{v \in V} x (t) \prod_{v \notin V} (1 - x (t)),

where $v \in V$ represents the set of viruses that belong to the clade and $v \notin V$ represents the set of viruses that do not belong to the clade. Each frequency trajectory is estimated by simultaneously maximizing the likelihood of the process model and the likelihood of the observation model via adjusting frequency trajectory $\vec{x} = (x_{1}, \dots x_{n})$ .

Population immunity

Request a detailed protocol

For antigenically diverse pathogens, antigenic novelty represents a fitness advantage (Lipsitch and O'Hagan, 2007). This means that viruses that are antigenically distinct from previously circulating viruses are able to access more susceptible hosts, allowing the antigenically novel lineage to expand. We adapt a simple deterministic model from Luksza and Lässig (2014) to directly quantify dengue antigenic novelty and its impact on viral fitness. We quantify population immunity to virus $i$ at time $t$ , $P_{i} (t)$ , as a function of which clades have recently circulated in the past $N$ years, and how antigenically similar each of these clades is to virus $i$ , so that

P_{i} (t) = \sum_{n = 1}^{n = N} (w (n) \sum_{j} (x_{j} (t - n) C (D_{i j}))),

where $D_{i j}$ is the antigenic distance between $i$ and each non-overlapping clade $j$ , $n$ is the number of years since exposure, and $x_{j} (t - n)$ is the relative frequency of $j$ at year $t - n$ . Waning immunity is modeled as a non-negative linear function of time following

w (n) = \max (1 - γ n, 0) .

The relationship between antigenic distance and the probability of protection, $C$ , is also assumed to be non-negative and linear with slope $- σ$ , such that

C (D_{i j}) = \max (1 - σ D_{i j}, 0) .

In addition to this primary analysis, we conducted a secondary analysis with a different parameterization of immunity that removes waning of homotypic immunity while allowing waning of heterotypic immunity. In this case, we assume the relationship between antigenic distance and the probability of protection, $C$ , to be 50% at antigenic distance $1 / σ$ and to wane based on years since infection $n$ modified by $γ_{het}$ following

C (D_{i j}, n) = \exp (- σ {(1 / γ_{het})}^{n} D_{i j}) .

We model the effects of population immunity, $P_{i} (t)$ , on viral antigenic fitness, $f_{i} (t)$ , as

f_{i} (t) = f_{0} - β P_{i} (t),

where $β$ and $f_{0}$ are fit parameters representing the slope of the linear relationship between immunity and fitness, and the intrinsic relative fitness of each serotype, respectively.

Frequency predictions

Request a detailed protocol

Similar to the model implemented in Luksza and Lässig (2014), we estimate predicted clade frequencies at time $t + d t$ as

\hat{x_{i}} (t + d t) = \frac{x_{i} (t) e^{f_{i} (t) d t}}{\sum_{i} x_{i} (t) e^{f_{i} (t) d t}}

for short-term predictions (where $d t < 1$ year).

We do not attempt to predict future frequencies for clades with $x_{i} (t) < 0.05$ .

For long-term predictions, we must account for immunity accrued at each intermediate timepoint between $t$ and $d t$ . We divide the interval between $t$ and $d t$ into 3 month timepoints, $[t + u, t + 2 u, \dots, t + U]$ , such that $U = d t$ . We then compound immunity based on predicted clade frequencies at each intermediate timepoint following

\hat{x_{i}} (t + u) = x_{i} (t) e^{f_{i} (t) u}

\hat{x_{i}} (t + 2 u) = \hat{x_{i}} (t + u) e^{f_{i} (t + u) u}

\dots

\hat{x_{i}} (t + U) = x_{i} (t) e^{f_{i} (t) u} e^{f_{i} (t + u) u} e^{f_{i} (t + 2 u) u} \dots e^{f_{i} (t + U) u}

\hat{x_{i}} (t + d t) = \hat{x_{i}} (t + U) = x_{i} (t) e^{\sum_{u} f_{i} (t + u) u}

We then calculate clade growth rates, defined as the fold-change in relative clade frequency between time $t$ and time $t + d t$

\frac{\hat{x_{i}} (t + d t)}{x_{i} (t)} .

Null models

Request a detailed protocol

To quantify the impact of antigenic fitness on DENV clade success, we compare our antigenically-informed model to two null models.

Under the ‘equal fitness null’ model, all viruses have equal total fitness (antigenic and intrinsic fitness) at all timepoints

f_{i}^{e q u a l} (t) = 0

{\hat{x_{i}}}^{e q u a l} (t + d t) = x_{i} (t) e^{0} = x_{i} (t) .

Under the ‘intrinsic fitness null’ model, all viruses have equal antigenic fitness but serotype-specific intrinsic fitness at all timepoints

f_{i}^{i n t r i n s i c} (t) = f_{0}

{\hat{x_{i}}}^{i n t r i n s i c} (t + d t) = x_{i} (t) e^{f_{0}} .

Model performance assessment and parameter fitting

Request a detailed protocol

We assess predictive power as the root mean squared error between predicted and empirical clade frequencies. To assess both the final frequency predictions and the predicted clade trajectories, this RMSE includes error for each clade, for each starting timepoint $t$ , and for each intermediate predicted timepoint $t + u$ .

Our frequency prediction model has a total of six free parameters. We jointly fit these parameters to minimize RMSE of serotype frequency predictions via the Nelder-Mead algorithm as implemented in SciPy v.1.0.0 (Table 2) (Jones et al., 2001; Gao and Han, 2012). We use $N = 2$ years of previous immunity that contribute to antigenic fitness and project $d t = 2$ years in the future when predicting clade frequencies.

Simulations

Request a detailed protocol

To ensure the model machinery functions correctly, we seeded a forward simulation of clade dynamics with 2 years of empirical frequencies and simulated predicted dynamics over the remainder of the time course (Figure 5—figure supplement 3). We then fit model parameters as described above, and obtained parameter values that well recover input values (Table 5).

Table 5

Parameter recovery against simulated data.

Parameter	Input value	Optimized value
$β$	3.25	3.10
$γ$	0.55	0.56
$σ$	2.35	2.57
$f_{0}^{(1)}$	0.70	0.72
$f_{0}^{(2)}$	0.85	0.78
$f_{0}^{(3)}$	0.40	0.41

Data and software availability

Request a detailed protocol

Sequence and titer data, as well as all source code used for analyses and figure generation, is publicly available at https://github.com/blab/dengue-antigenic-dynamics (copy archived at https://github.com/elifesciences-publications/dengue-antigenic-dynamics). Our work relies upon many open source Python packages and software tools, including iPython (Perez and Granger, 2007), Matplotlib (Hunter, 2007), Seaborn (Waskom, 2017), Pandas (McKinney, 2010), CVXOPT (Andersen et al., 2013), NumPy (van der Walt et al., 2011; Gao and Han, 2012), Biopython (Cock et al., 2009), SciPy (Jones et al., 2001), Statsmodels (Seabold and Perktold, 2010), Nextstrain (Hadfield et al., 2018), MAFFT (Katoh and Standley, 2013), and IQ-TREE (Nguyen et al., 2015). Package versions are documented in the GitHub repository.

Data availability

All data, code, model implementations, analyses and figures are available via our online repository at github.com/blab/dengue-antigenic-dynamics (copy archived athttps://github.com/elifesciences-publications/dengue-antigenic-dynamics).

References

1. Allicock OM
2. Lemey P
3. Tatem AJ
4. Pybus OG
5. Bennett SN
6. Mueller BA
7. Suchard MA
8. Foster JE
9. Rambaut A
10. Carrington CV
(2012) Phylogeography and population dynamics of dengue viruses in the americas
Molecular Biology and Evolution 29:1533–1543.

https://doi.org/10.1093/molbev/msr320
- PubMed
- Google Scholar
Software
(2013)
CVXOPT: a Python package for convex optimization

Abel Ee Ucla Edu/cvxopt.
(2011) Strength and tempo of selection revealed in viral gene genealogies
BMC Evolutionary Biology 11:220.

https://doi.org/10.1186/1471-2148-11-220
- PubMed
- Google Scholar
(2012) Canalization of the evolutionary trajectory of the human influenza virus
BMC Biology 10:38.

https://doi.org/10.1186/1741-7007-10-38
- PubMed
- Google Scholar
1. Bhatt S
2. Gething PW
3. Brady OJ
4. Messina JP
5. Farlow AW
6. Moyes CL
7. Drake JM
8. Brownstein JS
9. Hoen AG
10. Sankoh O
11. Myers MF
12. George DB
13. Jaenisch T
14. Wint GR
15. Simmons CP
16. Scott TW
17. Farrar JJ
18. Hay SI
(2013) The global distribution and burden of dengue
Nature 496:504–507.

https://doi.org/10.1038/nature12060
- PubMed
- Google Scholar
1. Cock PJ
2. Antao T
3. Chang JT
4. Chapman BA
5. Cox CJ
6. Dalke A
7. Friedberg I
8. Hamelryck T
9. Kauff F
10. Wilczynski B
11. de Hoon MJ
(2009) Biopython: freely available Python tools for computational molecular biology and bioinformatics
Bioinformatics 25:1422–1423.

https://doi.org/10.1093/bioinformatics/btp163
- PubMed
- Google Scholar
1. de Alwis R
2. Williams KL
3. Schmid MA
4. Lai CY
5. Patel B
6. Smith SA
7. Crowe JE
8. Wang WK
9. Harris E
10. de Silva AM
(2014) Dengue viruses are enhanced by distinct populations of serotype cross-reactive antibodies in human immune sera
PLOS Pathogens 10:e1004386.

https://doi.org/10.1371/journal.ppat.1004386
- PubMed
- Google Scholar
1. Forshey BM
2. Reiner RC
3. Olkowski S
4. Morrison AC
5. Espinoza A
6. Long KC
7. Vilcarromero S
8. Casanova W
9. Wearing HJ
10. Halsey ES
11. Kochel TJ
12. Scott TW
13. Stoddard ST
(2016) Incomplete protection against dengue virus type 2 Re-infection in peru
PLOS Neglected Tropical Diseases 10:e0004398.

https://doi.org/10.1371/journal.pntd.0004398
- PubMed
- Google Scholar
1. Gao F
2. Han L
(2012) Implementing the Nelder-Mead simplex algorithm with adaptive parameters
Computational Optimization and Applications 51:259–277.

https://doi.org/10.1007/s10589-010-9329-3
- Google Scholar
(1982) Identification of distinct antigenic determinants on dengue-2 virus using monoclonal antibodies
The American Journal of Tropical Medicine and Hygiene 31:548–555.

https://doi.org/10.4269/ajtmh.1982.31.548
- PubMed
- Google Scholar
(2007) Analysis of repeat hospital admissions for dengue to estimate the frequency of third or fourth dengue infections resulting in admissions and dengue hemorrhagic fever, and serotype sequences
The American Journal of Tropical Medicine and Hygiene 77:910–913.

https://doi.org/10.4269/ajtmh.2007.77.910
- PubMed
- Google Scholar
(2014) Innate immunity to dengue virus infection and subversion of antiviral responses
Journal of Molecular Biology 426:1148–1160.

https://doi.org/10.1016/j.jmb.2013.11.023
- PubMed
- Google Scholar
(1998) Chaos, persistence, and evolution of strain structure in antigenically diverse infectious agents
Science 280:912–915.

https://doi.org/10.1126/science.280.5365.912
- PubMed
- Google Scholar
1. Hadfield J
2. Megill C
3. Bell SM
4. Huddleston J
5. Potter B
6. Callender C
7. Sagulenko P
8. Bedford T
9. Neher RA
(2018) Nextstrain: real-time tracking of pathogen evolution
Bioinformatics 34:4121–4123.

https://doi.org/10.1093/bioinformatics/bty407
- PubMed
- Google Scholar
1. Halstead SB
(1979) In vivo enhancement of dengue virus infection in rhesus monkeys by passively transferred antibody
Journal of Infectious Diseases 140:527–533.

https://doi.org/10.1093/infdis/140.4.527
- PubMed
- Google Scholar
1. Holmes E
2. Twiddy S
(2003) The origin, emergence and evolutionary genetics of dengue virus
Infection, Genetics and Evolution 3:19–28.

https://doi.org/10.1016/S1567-1348(03)00004-2
- Google Scholar
1. Hunter JD
(2007) Matplotlib: a 2D graphics environment
Computing in Science & Engineering 9:90–95.

https://doi.org/10.1109/MCSE.2007.55
- Google Scholar
(2001) SciPy: Open source scientific tools for Python
SciPy: Open source scientific tools for Python, http://www.scipy.org/.

http://www.scipy.org/
- Google Scholar
1. Juraska M
2. Magaret CA
3. Shao J
4. Carpp LN
5. Fiore-Gartland AJ
6. Benkeser D
7. Girerd-Chambaz Y
8. Langevin E
9. Frago C
10. Guy B
11. Jackson N
12. Duong Thi Hue K
13. Simmons CP
14. Edlefsen PT
15. Gilbert PB
(2018) Viral genetic diversity and protective efficacy of a tetravalent dengue vaccine in two phase 3 trials
PNAS 115:E8378–E8387.

https://doi.org/10.1073/pnas.1714250115
- PubMed
- Google Scholar
1. Katoh K
2. Standley DM
(2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability
Molecular Biology and Evolution 30:772–780.

https://doi.org/10.1093/molbev/mst010
- PubMed
- Google Scholar
1. Katzelnick LC
2. Fonville JM
3. Gromowski GD
4. Bustos Arriaga J
5. Green A
6. James SL
7. Lau L
8. Montoya M
9. Wang C
10. VanBlargan LA
11. Russell CA
12. Thu HM
13. Pierson TC
14. Buchy P
15. Aaskov JG
16. Muñoz-Jordán JL
17. Vasilakis N
18. Gibbons RV
19. Tesh RB
20. Osterhaus AD
21. Fouchier RA
22. Durbin A
23. Simmons CP
24. Holmes EC
25. Harris E
26. Whitehead SS
27. Smith DJ
(2015) Dengue viruses cluster antigenically but not as discrete serotypes
Science 349:1338–1343.

https://doi.org/10.1126/science.aac5017
- PubMed
- Google Scholar
(2016) Neutralizing antibody titers against dengue virus correlate with protection from symptomatic infection in a longitudinal cohort
PNAS 113:728–733.

https://doi.org/10.1073/pnas.1522136113
- PubMed
- Google Scholar
1. Katzelnick LC
2. Gresh L
3. Halloran ME
4. Mercado JC
5. Kuan G
6. Gordon A
7. Balmaseda A
8. Harris E
(2017) Antibody-dependent enhancement of severe dengue disease in humans
Science 358:929–932.

https://doi.org/10.1126/science.aan6836
- PubMed
- Google Scholar
1. Kochel TJ
2. Watts DM
3. Halstead SB
4. Hayes CG
5. Espinoza A
6. Felices V
7. Caceda R
8. Bautista CT
9. Montoya Y
10. Douglas S
11. Russell KL
(2002) Effect of dengue-1 antibodies on american dengue-2 viral infection and dengue haemorrhagic fever
The Lancet 360:310–312.

https://doi.org/10.1016/S0140-6736(02)09522-3
- Google Scholar
(2012) The LANL hemorrhagic fever virus database, a new platform for analyzing biothreat viruses
Nucleic Acids Research 40:D587–D592.

https://doi.org/10.1093/nar/gkr898
- PubMed
- Google Scholar
(1997) Molecular evolution and phylogeny of dengue-4 viruses
Journal of General Virology 78:2279–2284.

https://doi.org/10.1099/0022-1317-78-9-2279
- PubMed
- Google Scholar
1. Lee JM
2. Huddleston J
3. Doud MB
4. Hooper KA
5. Wu NC
6. Bedford T
7. Bloom JD
(2018) Deep mutational scanning of hemagglutinin helps predict evolutionary fates of human H3N2 influenza variants
PNAS 115:E8276–E8285.

https://doi.org/10.1073/pnas.1806133115
- PubMed
- Google Scholar
1. Lipsitch M
2. O'Hagan JJ
(2007) Patterns of antigenic diversity and the mechanisms that maintain them
Journal of the Royal Society Interface 4:787–802.

https://doi.org/10.1098/rsif.2007.0229
- Google Scholar
1. Lourenço J
2. Tennant W
3. Faria NR
4. Walker A
5. Gupta S
6. Recker M
(2018) Challenges in dengue research: a computational perspective
Evolutionary Applications 11:516–533.

https://doi.org/10.1111/eva.12554
- PubMed
- Google Scholar
1. Lourenço J
2. Recker M
(2013) Natural, persistent oscillations in a spatial multi-strain disease system with application to dengue
PLOS Computational Biology 9:e1003308.

https://doi.org/10.1371/journal.pcbi.1003308
- PubMed
- Google Scholar
1. Luksza M
2. Lässig M
(2014) A predictive fitness model for influenza
Nature 507:57–61.

https://doi.org/10.1038/nature13087
- PubMed
- Google Scholar
Conference
1. McKinney W
(2010)
Data structures for statistical computing in Python

Proceedings of the 9th Python in Science Conference. pp. 51–56.
- Google Scholar
(2014)
On the risk of severe dengue during secondary infection: a systematic review coupled with mathematical modeling

Journal of Vector Borne Diseases 51:153.
- PubMed
- Google Scholar
1. Nagao Y
2. Koelle K
(2008) Decreases in dengue transmission may act to increase the incidence of dengue hemorrhagic fever
PNAS 105:2238–2243.

https://doi.org/10.1073/pnas.0709029105
- PubMed
- Google Scholar
(2016) Prediction, dynamics, and visualization of antigenic phenotypes of seasonal influenza viruses
PNAS 113:E1701–E1709.

https://doi.org/10.1073/pnas.1525578113
- PubMed
- Google Scholar
(2015) IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies
Molecular Biology and Evolution 32:268–274.

https://doi.org/10.1093/molbev/msu300
- PubMed
- Google Scholar
1. OhAinle M
2. Balmaseda A
3. Macalalad AR
4. Tellez Y
5. Zody MC
6. Saborío S
7. Nuñez A
8. Lennon NJ
9. Birren BW
10. Gordon A
11. Henn MR
12. Harris E
(2011) Dynamics of dengue disease severity determined by the interplay between viral genetics and serotype-specific immunity
Science Translational Medicine 3:114ra128.

https://doi.org/10.1126/scitranslmed.3003084
- PubMed
- Google Scholar
1. Olkowski S
2. Forshey BM
3. Morrison AC
4. Rocha C
5. Vilcarromero S
6. Halsey ES
7. Kochel TJ
8. Scott TW
9. Stoddard ST
(2013) Reduced risk of disease during postsecondary dengue virus infections
The Journal of Infectious Diseases 208:1026–1033.

https://doi.org/10.1093/infdis/jit273
- PubMed
- Google Scholar
1. Perez F
2. Granger BE
(2007) IPython: a system for interactive scientific computing
Computing in Science & Engineering 9:21–29.

https://doi.org/10.1109/MCSE.2007.53
- Google Scholar
(2016) Highly divergent dengue virus type 1 genotype sets a new distance record
Scientific Reports 6:22356.

https://doi.org/10.1038/srep22356
- PubMed
- Google Scholar
1. Reich NG
2. Shrestha S
3. King AA
4. Rohani P
5. Lessler J
6. Kalayanarooj S
7. Yoon I-K
8. Gibbons RV
9. Burke DS
10. Cummings DAT
(2013) Interactions between serotypes of dengue highlight epidemiological impact of cross-immunity
Journal of the Royal Society Interface 10:20130414.

https://doi.org/10.1098/rsif.2013.0414
- Google Scholar
1. Rico-Hesse R
(1990) Molecular evolution and distribution of dengue viruses type 1 and 2 in nature
Virology 174:479–493.

https://doi.org/10.1016/0042-6822(90)90102-W
- PubMed
- Google Scholar
1. Russell PK
2. Nisalak A
(1967)
Dengue virus identification by the plaque reduction neutralization test

Journal of Immunology 99:291–296.
- PubMed
- Google Scholar
1. Sabin AB
(1952) Research on dengue during world war II
The American Journal of Tropical Medicine and Hygiene 1:30–50.

https://doi.org/10.4269/ajtmh.1952.1.30
- PubMed
- Google Scholar
(2017) Dengue diversity across spatial and temporal scales: local structure and the effect of host population size
Science 355:1302–1306.

https://doi.org/10.1126/science.aaj9384
- PubMed
- Google Scholar
1. Salje H
2. Cummings DAT
3. Rodriguez-Barraquer I
4. Katzelnick LC
5. Lessler J
6. Klungthong C
7. Thaisomboonsuk B
8. Nisalak A
9. Weg A
10. Ellison D
11. Macareo L
12. Yoon IK
13. Jarman R
14. Thomas S
15. Rothman AL
16. Endy T
17. Cauchemez S
(2018) Reconstruction of antibody dynamics and infection histories to evaluate dengue risk
Nature 557:719–723.

https://doi.org/10.1038/s41586-018-0157-4
- PubMed
- Google Scholar
(1984) Risk factors in dengue shock syndrome: a prospective epidemiologic study in Rayong, Thailand. I. the 1980 outbreak
American Journal of Epidemiology 120:653–669.

https://doi.org/10.1093/oxfordjournals.aje.a113932
- PubMed
- Google Scholar
Conference
1. Seabold S
2. Perktold J
(2010)
Statsmodels: econometric and statistical modeling with Python

9th Python in Science Conference.
- Google Scholar
1. Stanaway JD
2. Shepard DS
3. Undurraga EA
4. Halasa YA
5. Coffeng LE
6. Brady OJ
7. Hay SI
8. Bedi N
9. Bensenor IM
10. Castañeda-Orjuela CA
11. Chuang TW
12. Gibney KB
13. Memish ZA
14. Rafay A
15. Ukwaja KN
16. Yonemoto N
17. Murray CJL
(2016) The global burden of dengue: an analysis from the global burden of disease study 2013
The Lancet Infectious Diseases 16:712–723.

https://doi.org/10.1016/S1473-3099(16)00026-8
- PubMed
- Google Scholar
1. Ten Bosch QA
2. Clapham HE
3. Lambrechts L
4. Duong V
5. Buchy P
6. Althouse BM
7. Lloyd AL
8. Waller LA
9. Morrison AC
10. Kitron U
11. Vazquez-Prokopec GM
12. Scott TW
13. Perkins TA
(2018) Contributions from the silent majority dominate dengue virus transmission
PLOS Pathogens 14:e1006965.

https://doi.org/10.1371/journal.ppat.1006965
- PubMed
- Google Scholar
1. Tibshirani R
(1996) Regression shrinkage and selection via the lasso
Journal of the Royal Statistical Society: Series B (Methodological) 58:267–288.

https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
- Google Scholar
(2003) Inferring the rate and time-scale of dengue virus evolution
Molecular Biology and Evolution 20:122–129.

https://doi.org/10.1093/molbev/msg010
- PubMed
- Google Scholar
(2011) The NumPy array: a structure for efficient numerical computation
Computing in Science & Engineering 13:22–30.

https://doi.org/10.1109/MCSE.2011.37
- Google Scholar
1. Waggoner JJ
2. Balmaseda A
3. Gresh L
4. Sahoo MK
5. Montoya M
6. Wang C
7. Abeynayake J
8. Kuan G
9. Pinsky BA
10. Harris E
(2016) Homotypic dengue virus reinfections in nicaraguan children
Journal of Infectious Diseases 214:986–993.

https://doi.org/10.1093/infdis/jiw099
- PubMed
- Google Scholar
Software
1. Waskom M
(2017) Seaborn Visualization Library
Seaborn Visualization Library.

http://seaborn.pydata.org/
1. Wearing HJ
2. Rohani P
(2006) Ecological and immunological determinants of dengue epidemics
PNAS 103:11802–11807.

https://doi.org/10.1073/pnas.0602960103
- PubMed
- Google Scholar
(2005) Clade replacements in dengue virus serotypes 1 and 3 are associated with changing serotype prevalence
Journal of Virology 79:15123–15130.

https://doi.org/10.1128/JVI.79.24.15123-15130.2005
- PubMed
- Google Scholar

Article and author information

Author details

Sidney M Bell
1. Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, United States
2. Molecular and Cell Biology Program, University of Washington, Seattle, United States
Contribution
Data curation, Software, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Methodology, Writing—original draft, Writing—review and editing

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0003-1933-6033
Leah Katzelnick
1. Division of Infectious Diseases and Vaccinology, University of California, Berkeley, Berkeley, United States
2. Department of Biology, University of Florida, Gainesville, United States
Contribution
Data curation, Methodology, Writing—review and editing

Competing interests
No competing interests declared
Trevor Bedford

Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, United States

Contribution
Conceptualization, Resources, Software, Formal analysis, Supervision, Funding acquisition, Investigation, Methodology, Writing—review and editing

For correspondence
tbedford@fredhutch.org

Competing interests
No competing interests declared

"This ORCID iD identifies the author of this article:" 0000-0002-4039-5794

Funding

National Science Foundation (DGE-1256082)

Sidney M Bell

Pew Charitable Trusts

Trevor Bedford

National Institute of General Medical Sciences (R35GM119774-01)

Trevor Bedford

National Institute of Allergy and Infectious Diseases (R01AI114703-01)

Leah Katzelnick

National Institute of Allergy and Infectious Diseases (P01AI106695)

Leah Katzelnick

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We thank Richard Neher, John Huddleston, Andrew Rambaut, Molly OhAinle, David Shaw, Paul Edlefsen, Michal Juraska, and all members of the Bedford Lab for useful discussion and advice. SB is a Graduate Research Fellow and is supported by NSF DGE-1256082. TB is a Pew Biomedical Scholar and is supported by NIH R35 GM119774-01. LK is supported by NIH awards R01AI114703-01 and P01AI106695. Our work depends on open data sharing and many open source software tools. We gratefully acknowledge the authors and developers who make our work possible.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.