Introduction

Infection or vaccination with influenza virus elicits a neutralizing antibody response targeting the viral hemagglutinin (HA) protein1,2. These antibodies correlate with protection against infection and so provide substantial immunity to strains that they neutralize111. However, the HA of human influenza evolves rapidly, acquiring mutations that erode neutralization by antibodies elicited by prior infections and vaccinations1214. New HA variants with reduced neutralization are generally the most evolutionarily successful, and repeatedly replace the current dominant variant(s) in a process known as antigenic drift13,1517. As a result, people are reinfected roughly every five years1,18,19 and vaccines are updated annually to attempt to match the currently dominant influenza strains20.

The human antibody response to influenza is shaped in part by past exposures to related strains, a phenomenon known as imprinting2123. While an individual’s exposure history is partially dependent on their birth year19,24,25, HA’s rapid evolution along with variation in which strains infect even individuals with similar birth years create heterogeneous exposure histories both between and within birth cohorts2,22,2628. These heterogeneous exposure histories lead to substantial differences in how the neutralizing antibodies of different people target HA, and hence how their neutralizing antibody immunity is affected by viral mutations26,29,30.

Determining how population heterogeneity in human neutralizing antibody specificities shapes influenza virus evolution has been challenging because conventional methods (e.g., hemagglutination inhibition31 and neutralization assays32) are low throughput. It is therefore experimentally daunting to use these methods to measure neutralizing titers for large numbers of human sera against the full diversity of influenza virus strains that circulate in a single season. In part because of these limitations, a common approach is to use sera from singly infected ferrets to estimate antigenic distances between different viral strains13,33. However, growing evidence indicates sera from singly infected ferrets are an imperfect proxy for human populations with complex and heterogeneous exposure histories23,26,27,30,3437. Therefore, the reports from the bi-annual influenza vaccine-strain selection meetings have increasingly referenced titer measurements for human sera, made using either individual sera or serum pools3842.

Here, we use a new sequencing-based assay43 to measure the neutralization titers of 78 recent H3N2 viral strains by a large set of children and adult sera. These measurements quantify the heterogeneity of neutralizing antibody immunity to influenza across different members of the population. We find that the evolutionary success of different H3N2 strains is highly correlated with the fraction of sera that have low titers against each strain, suggesting that large-scale sequencing-based neutralization assays can help inform understanding of influenza virus evolution.

Results

A sequencing-based assay to measure neutralization titers against H3N2 influenza strains

To measure serum-neutralization titers against a large number of human influenza virus HA strains, we utilized a recently developed sequencing-based assay43. In this approach, influenza viruses are created encoding different HA strains, each tagged with an identifying nucleotide barcode29,43,44 (Figure 1a and Supplemental Figure 1). The barcoded viral variants are then pooled and assayed in an experimental format similar to a traditional neutralization assay (Figure 1b) but with the infectivity of each variant quantified by Illumina sequencing of the viral barcodes (Figure 1c). This sequencing-based readout enabled us to measure 1,872 neutralization curves per 96-well plate (triplicate measurements for 78 viruses against 8 sera).

Overview of sequencing-based neutralization assay.

(a) We generate a library of barcoded influenza viruses carrying different HAs, each identified by a unique 16-nucleotide barcode in its genome. See Supplemental Figure 1 for details. (b) The virus library is incubated with sera and then added to MDCK-SIAT1 cells in a 96-well plate. At 12-14 hours post infection, cells are lysed and a known concentration of barcoded RNA spike-in is added to each well. (c) The percent infectivity of each viral variant at each serum concentration is calculated by determining the barcode counts by sequencing. Viral barcode counts are normalized by dividing the viral barcode counts by the barcoded RNA spike-in counts. Percent infectivities are calculated by dividing the normalized barcode counts for each serum concentration by those in the no-serum control wells. These percent infectivities are used to fit neutralization curves and determine the neutralization titer, which is defined as the reciprocal of the serum dilution at which 50% of the virus is neutralized. Note that since each HA is associated with three different barcodes, each neutralization titer is measured in triplicate; we report the median across the three replicates.

We designed a virus library in November of 2023 with the goal of covering the diversity of H3N2 HA sequences present at the beginning of the 2023 Northern Hemisphere season. For the library, we selected HA haplotypes with high frequencies or a substantial number of descendants within a 12-month window preceding November 2023. This process identified 62 HA haplotypes; for each haplotype, we selected a representative naturally-occuring strain HA for inclusion in the library. These strains are widely distributed across the branches of the Nextstrain two-year H3N2 seasonal influenza phylogenetic tree from late 2023 (Figure 2a). We also supplemented the library with the HAs from Northern Hemisphere egg- and cell-passaged vaccine strains from all seasons between 2014 and 2024 (Figure 2b and Supplemental Table 1).

The library covers most H3N2 HA diversity in 2023.

(a) Phylogenetic tree of HAs from recent strains in the library, with 2023-circulating strains in blue, cell-passaged vaccine strains in white and egg-passaged vaccine strains in yellow. The cell-passaged vaccine strain A/Massachusetts/18/2022 is identical on the amino-acid level to a strain circulating in 2023, and so is also classified as a 2023-circulating strain in our analyses. See https://nextstrain.org/groups/jbloomlab/seqneut/h3n2-ha-2023-2024 for an interactive view of the full library on a background of the 2-year Nextstrain tree available in November 2023. (b) Phylogenetic tree of HAs from cell- and egg-passaged strains corresponding to H3N2 component of the Northern Hemisphere influenza vaccine from 2014 to 2024 (Supplemental Table 1). Cell-passaged vaccine strains are in white and egg-passaged vaccine strains are in yellow. Virus with the HA from the egg-passaged A/Kansas/14/2017 strain could not be grown in our system, and was excluded. (c) The fraction of all sequenced human H3N2 strains with HAs that closely match HA1 sequences within our library from January 2022 to June 2024. The rolling means (+/-10 days) of total sequence counts and fraction of sequences are plotted, with strains matched in the library in blue and strains not represented by the library in green. A close match was defined as being within one amino acid mutation in the HA1 domain of HA. The HA1 domain encompasses sites where the majority of antigenic evolution takes place45,49,103106, but a similar analysis with full HA ectodomain sequences produces a qualitatively similar result (Supplemental Figure 2a).

Our library design successfully covered most of the diversity of human H3N2 HA in 2023, although by 2024 there were an increasing number of human H3N2 HA sequences not well represented by a strain in our library (Figure 2c and Supplemental Figure 2a). The HA protein sequence diversity among the 2023-circulating strains in the library was fairly low, reflecting the modest standing genetic diversity of H3N2 HA in 2023. Library strains were usually separated from their closest relative in the library by just a few HA amino-acid mutations (Supplemental Figure 2b-e). The greatest variation among the library strains was generally at sites in classically defined antigenic regions on the globular head of the HA protein (Supplemental Figure 2f,g).

We generated barcoded viruses for each of the 78 HA sequences we had chosen, making three independently barcoded viruses for each HA sequence (Figure 1). Note that all non-HA genes (including neuraminidase) for these viruses are from the lab-adapted A/WSN/1993 (H1N1) strain. To determine the relative titers of all the strains, we initially pooled them at equal volumes, infected cells, and quantified the relative frequency of each strain by barcode sequencing (Supplemental Figure 1d). We then re-pooled the strains based on these relative titers to generate a pooled library where each HA was at approximately equal frequencies (Supplemental Figure 3).

Neutralization titers to 2023 strains are heterogeneous across sera from children and adults

We measured neutralization titers of all 78 viral strains against a total of 150 serum samples from children and adults with diverse ages and vaccination histories (Table 1), representing 11,700 total titer measurements. The measured titers were highly reproducible both across different barcodes for the same HA and assays performed on different days (Supplemental Figure 4).

Overview of children and adult cohorts.

Summary of sera from the children and adult vaccination cohorts. Children sera were from routine hospital or clinic visit blood draws, and had limited information about vaccine and exposure histories pulled from electronic health records. Adult sera were collected through vaccine response studies based in the United States of America (USA) and Australia at various time points pre-vaccination and post-vaccination. Detailed metadata can be found at https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024/tree/main/data/sera_metadata

We first focus our analysis on the titers for sera collected in the second half of 2023 from 56 children (ages 1 to 14 years) and 39 pre-vaccination adults (ages 22 to 74 years). Most of these adults had been vaccinated in the prior year, while the vaccination histories of the children are mostly unknown. All these sera were collected from locations in the USA (Table 1). We reasoned that these sera likely provide a reasonable representation of population neutralizing antibody immunity at the beginning of the 2023-2024 Northern Hemisphere influenza season.

There was substantial variation in neutralization titers across sera, and in many cases also across different viral strains for the same serum. For example, Figure 3a shows neutralization titers for a child and adult serum sample against the 62 viruses that represented HA diversity in 2023. Each serum has reduced titers to certain groups of strains (Figure 3a). The child has reduced titers to strains with a mutation at site 145, while the adult is unaffected by mutations at site 145 but has decreased titers to strains with mutations at sites 275 and 276 (Figure 3a). Sites 145, 275, and 276 are all in known antigenic regions of HA45. Notably, mutations at sites 145 and 276 both subsequently increased in frequency among human H3N2 in 2024 (Supplemental Figure 5a-c), and the strain chosen for the 2024 Southern Hemisphere vaccine contains mutations at both these sites46.

Neutralization titers to 2023-circulating strains for children and adults vary among individuals and cohorts.

(a) Neutralization titer profiles for a child and a pre-vaccination adult, showing the titer of each serum against each of the 2023-circulating strains in the library. Neutralization by the child’s serum is reduced for strains with the S145N mutation in antigenic region A, while neutralization by the adult’s serum is reduced by multiple mutations within antigenic region C (sites 275 and 276). Strains are grouped phylogenetically on the X-axis. (b) Neutralization titer profiles across all individuals from the children and adult pre-vaccination cohorts. Each thin line is a neutralization titer profile for an individual serum. Each point represents the median neutralization titer across all sera for that strain.

Some of the heterogeneity apparent in these two example neutralization profiles (Figure 3a) is mirrored across the larger set of 56 children and 39 adult sera (Figure 3b). The most striking observation is the wide person-to-person variation in titers, which is especially apparent for the children sera (Supplemental Figure 5d,f). For instance, a few children sera neutralize all strains with titers >1,000, but most children sera titers are roughly an order of magnitude lower (Figure 3b). There is also person-to-person variation in titers across adult sera, although less so than for the children sera (Figure 3b, Supplemental Figure 5d-f). Nested within this person-to-person variation is strain-to-strain variation in titers for each serum. While the distribution of within-serum, strain-to-strain variation is not substantially different between age cohorts (Supplemental Figure 5g), for a few specific strains the strain-to-strain variation does segregate by age cohort. For instance, across the entire sera set, strains with mutations at site 145 have relatively lower titers than other strains for the children sera, but not for the adult sera (Figure 3). However, much of the person-to-person and strain-to-strain variation seems idiosyncratic to individual sera rather than age groups, since the heterogeneity persists even if the sera are analyzed in more fine-grained age groups (Supplemental Figure 5h).

Pooled sera fail to capture the heterogeneity of individual sera

One approach that has been used for influenza antigenic characterization is to pool sera from many different individuals and then measure titers against the serum pool3942. However, given the dramatic heterogeneity across sera (Figure 3), these pools might be expected to mostly just reflect the properties of the most potent sera in the pool rather than capturing the full heterogeneity.

To confirm that sera pools are not reflective of the full heterogeneity of their constituent sera, we created equal volume pools of the children and adult sera and measured the titers of these pools using the sequencing-based neutralization assay. As expected, neutralization titers of the pooled sera were always higher than the median across the individual constituent sera, and the pool titers against different viral strains were only modestly correlated with the median titers across individual sera (Figure 4). The differences in titers across strains were also compressed in the serum pools relative to the median across individual sera (Figure 4). The failure of the serum pools to capture the median titers of all the individual sera is especially dramatic for the children sera (Figure 4) because these sera are so heterogeneous in their individual titers (Figure 3b). Taken together, these results show that serum pools do not fully represent individual-level heterogeneity.

Pooled sera neutralization titer profiles do not capture nuances from individually-measured sera.

Correlations between titers measured from pooled sera and the median of individually-measured titers are plotted for (a) children, (b) pre-vaccination adults, and (c) children and pre-vaccination adults together. Each dot corresponds to the pooled or median titer for a given 2023-circulating library strain. (d) The full neutralization profiles for all children and pre-vaccination adults individually and as a serum pool, replotted from Figure 3b. Titers for individual sera are plotted as thin red lines, with the median titer across all children and adult sera indicated by red points, and the titer of the pooled serum indicated by a black line and points.

Evolutionary success of viral strains correlates with neutralization titers from individually-measured sera but not pooled sera

The most evolutionarily successful viral strains each human influenza season are those that jointly maximize both inherent transmissibility and evasion of pre-existing population immunity13,17,4750. Several studies have shown that experimental measurements of anti-HA antibody immunity (e.g., hemagglutination-inhibition or neutralization assays) can partially explain the evolutionary competition among human influenza strains, presumably by capturing some of the contribution of immune evasion to viral fitness13,17,4750. Our sequencing-based neutralization assay enabled us to measure human neutralizing antibody titers to a wide range of influenza strains on an unprecedented scale, so we sought to test whether these measurements correlated with the evolutionary success of these strains.

To estimate the evolutionary success of different human H3N2 influenza strains during 2023, we used multinomial logistic regression, which analyzes strain frequencies over time to calculate strain-specific relative growth rates5153. There were sufficient sequencing counts to reliably estimate growth rates in 2023 for 12 of the HAs for which we measured titers using our sequencing-based neutralization assay libraries (Figure 5a,b and Supplemental Figure 6a,b).

Individually-measured serum neutralization titers correlate with the growth rate of viral strains in the human population in 2023.

(a) Strain frequencies and model fits for the strains with sufficient sequencing counts to estimate a growth rate using multinomial logistic regression. Dots represent strain frequencies averaged across a 14-day sliding window, and lines represent the model fit. (b) Phylogenetic tree of the 12 viral strains with sufficient sequencing data to estimate their relative growth rates. Each strain is colored by its growth rate relative to the A/Massachusetts/18/2022 strain. (c) Correlation between the strain growth rates and serum neutralization titers from 95 children and pre-vaccination adults. The plot on the left shows the fraction of sera with titers below 138. The plot at right shows the titers of an equal volume pool of all sera. In both panels, each point corresponds to the growth rate and titer for one of the 12 viral strains. The numbers in the upper corners show the Pearson correlation coefficient R and the P-value as assessed by randomizing the experimental data among strains 200 times. (c) Correlation of strain growth rate with fraction of sera that have a titer below a given threshold for thresholds between 40 and 1000. The thick black line shows the actual correlation at each threshold, and the thin blue lines show the correlations for 200 randomizations of the experimental data among the strains. The threshold that gave the highest correlation is 138; none of the randomizations had a correlation as high as the actual data at any threshold so the P-value is < 0.005. (d) Correlation of strain growth rate and the number of HA1 amino-acid mutations relative to the common ancestor of the 12 strains with growth rate estimates. Correlations are similar in strength for full HA ectodomain amino-acid mutations and HA nucleotide mutations (Supplemental Figure 6l). (f) Correlations between the fraction of sera with titers below 138 and the number of HA1 amino-acid mutations for the 12 strains with estimated growth rates (left) and all 62 of the 2023-circulating strains in the library (right).

We compared the estimated strain-specific growth rates to the measured neutralization titers across the entire set of children and adult sera for each strain. Susceptibility to influenza is thought to increase once neutralization titers fall below a threshold3,17,37, so we correlated the growth rates to the fraction of sera with titers below a threshold against each strain, testing a range of thresholds. The growth rates were most correlated with the fraction of sera with neutralization titers below 138 (Figure 5c), and this correlation was highly statistically significant as assessed by the fact that it exceeded the correlation for 200 different random permutations of the experimental titer data (Figure 5d). This finding demonstrates that the evolutionary success of human H3N2 influenza strains in 2023 was highly correlated with experimental measurements of the strains’ ability to evade neutralizing antibody immunity across the human population. Similar correlations are observed if we quantify the per-strain neutralization titers in terms of the geometric mean or median of the neutralization titers rather than by the fraction of sera below a titer threshold (Supplemental Figure 6c).

The above correlations were for growth rates estimated for strains with at least 80 sequencing counts in 2023. We also repeated the above analysis using lower cutoffs for how many sequencing counts are needed to estimate a strain’s growth rate; these estimates include more strains but are noisier. For these lower cutoffs, there remained a good although somewhat weaker correlation between growth rate and the fraction of sera below a titer threshold (Supplemental Figure 6d).

However, there was no correlation between the growth rates of the strains and their neutralization titers against the pooled children and adult sera (Figure 5c), emphasizing the importance of accounting for population heterogeneity. Broadly similar trends held if the analyses were done using neutralization titers for only the children or only the adult sera. Specifically, there were strong correlations between strain growth rate and fraction of sera with titers below a threshold against each strain when children and adult sera were analyzed individually, although the correlations were not quite as strong as for the combined children and adult sera set (Supplemental Figure 6e-j). But growth rates were at best modestly correlated with titers against pools of only children or only adult sera (Supplemental Figure 6e-j). The reduced disparity between the full sera set versus serum pool correlations for the children-only and adult-only sera sets compared to the full combined sera set could be because the reduced heterogeneity within the children-only and adult-only sets make the pools a better (although still imperfect) proxy for these sera sets versus the full combined sera set (Figure 4a-c and Supplemental Figure 6k).

The strong correlation between strain growth and the fraction of sera with titers below a threshold is highly significant as assessed by randomizing the experimental data among strains (Figure 5c,d). However, an important caveat is that the strains are phylogenetically related and growth rate may segregate with phylogeny (Figure 5b), and some strains share HA1 mutations due to common descent (Table 2). Accurately assessing the significance of relationships involving variables with phylogenetic structure is notoriously challenging from a statistical perspective54. To assess whether phylogenetic structure may contribute, we examined the correlation of strain growth and the number of HA1 amino-acid mutations relative to the common ancestor of all strains with growth estimates. This correlation was also strong and significant (Figure 5d and Supplemental Figure 6l), an expected result because the neutralization titers correlate with the number of HA1 mutations for the strains with sufficient sequences to estimate growth rates, although they do not correlate for the set of all strains (Figure 5e). A permutation feature importance analysis in the context of multiple linear regression indicated that the fraction of individuals with low neutralization titers are more important for explaining strain growth than the HA1 mutation count (68% importance for titers versus 32% for HA1 mutation counts), but the variables are so co-linear that they cannot be convincingly separated. Therefore, with the current data it is not possible to be confident that the experimentally measured titers provide substantially more information about the evolutionary success of viral strains than simple phylogenetic methods such as counting mutations.

HA1 mutations present in strains with estimated growth rates.

HA1 amino-acid mutations for each of the 12 strains with sufficient sequence counts to estimate growth rates. Mutations in known antigenic regions as defined by Muñoz and Deem45 are in bold text, and mutations outside of antigenic regions are in regular weight text. The table lists all HA1 mutations relative to the most-recent common ancestor of the 12 strains.

Patterns of neutralization of the past decade of vaccine strains stratify largely by birth cohort

We examined how well each serum neutralized viruses with HAs from each of the H3N2 influenza Northern Hemisphere vaccine strains from the last decade (2014 to 2024, see Supplemental Table 1). These historical vaccine strains cover a much wider span of evolutionary diversity that the 2023-circulating strains analyzed in the preceding sections (Figure 2a,b and Supplemental Figure 2b-e). For this analysis, we focused on the cell-passaged strains for each vaccine, as these are more antigenically similar to their contemporary circulating strains than the egg-passaged vaccine strains since they lack the mutations that arise during growth of viruses in eggs5557 (Supplemental Table 1).

The patterns of neutralization of the vaccine strains can mostly be rationalized in terms of exposure history. Children born between 2021 and 2023 generally have either no measurable titer to any strain or their highest titers to recent vaccine strains (Figure 6), presumably because these young children were either not yet exposed to H3N2 influenza or were exposed to a recent strain. Children born between 2015 and 2020 often neutralize the past decade of vaccine strains (Figure 6), consistent with the fact that children of this age have typically been exposed to H3N2 influenza19,58. However, the strain that is best neutralized by children born between 2015 and 2020 varies, potentially because different children in this age group were first infected by different viruses over the last decade. Older children born between 2009 and 2014 typically best neutralize the oldest vaccine strain tested (Figure 6), consistent with numerous studies showing that an individual’s neutralizing titers are highest to the influenza strains to which they were first exposed19,22,24,27,5860. Adults across age groups also have their highest titers to the oldest vaccine strain tested (Figure 6), consistent with the fact that these adults were first imprinted by exposure to an older strain.

Neutralization titers to the past decade of vaccine strains.

Neutralization titer profiles to the past decade of cell-passaged Northern Hemisphere vaccine strains across all individuals from the children and adult pre-vaccination cohorts, stratified by age group as indicated in plot panel titles. Strains are ordered on the x-axis by the year they were collected. Each thin line is a neutralization titer profile for an individual serum, and each colored point represents the median titer across all sera for that strain. See Supplemental Table 1 for details on which strains were in the vaccines for which seasons.

Vaccination of adults broadly increases neutralization titers to most strains

We examined the impact of vaccination on neutralization titers against the full set of strains in our H3N2 library. To do this, we analyzed matched 28-day post-vaccination samples for each of the above-described 39 pre-vaccination samples from the cohort of adults based in the USA (Table 1). We also analyzed a smaller set of matched pre-and post-vaccination sera samples from a cohort of eight adults based in Australia (Table 1). Note that there are several differences between these cohorts: the USA-based cohort received the 2023-2024 Northern Hemisphere egg-grown vaccine whereas the Australia-based cohort received the 2024 Southern Hemisphere cell-grown vaccine, and most individuals in the USA-based cohort had also been vaccinated in the prior season whereas most individuals in the Australia-based cohort had not. Therefore, multiple factors could contribute to observed differences in vaccine response between the cohorts.

For both cohorts, vaccination broadly increased neutralization titers to all 2023-circulating strains. For the USA-based cohort, vaccination typically increased titers to the 2023-circulating strains by ∼2–4-fold whereas for the Australia-based cohort the increases were typically ∼5–10-fold (Figure 7a). Due to the multiple differences between cohorts we are unable to confidently ascribe a cause to these differences in magnitude of vaccine response.

Impact of vaccination on neutralization titers.

(a) Neutralization titers pre- and post-vaccination for the USA-based adult cohort (top) and the Australia-based adult cohort (bottom) against the 2023-circulating library strains. Points indicate the median titers across participants and the shaded regions show the interquartile range. The H3N2 component strain of the 2024 Southern Hemisphere seasonal influenza vaccine (A/Massachusetts/18/2022) is indicated by bold text and an asterisk; this strain circulated in 2023 and so was classified as both a 2023-circulating strain and a vaccine strain in our analysis. (b) Neutralization titers pre- and post-vaccination for the two cohorts to each cell-passaged or egg-passaged Northern Hemisphere H3N2 vaccine strain from the past decade (Supplemental Table 1). Viruses are listed on the x-axis in order of increasing year in which they were isolated. The USA-based cohort received the 2023-2024 Northern Hemisphere seasonal influenza vaccine (A/Darwin/9/2021), and the Australia-based cohort received the 2024 Southern Hemisphere seasonal influenza vaccine (A/Massachusetts/18/2022).

Vaccination also generally increased titers to the last decade of both cell- and egg-passaged vaccine strains, although the increases tended to be largest to the most recent vaccine strains (Figure 7b). We therefore see modest evidence for the previously described phenomenon of back-boosting27, where vaccination with a recent strain increases titers to both recent and older strains. However, the extent of back-boosting is quite small for the oldest vaccine strains. We also found much higher titers against egg-passaged strains relative to the cell-passaged strains of the same year in both the USA-based and Australia-based cohorts, both pre- and post-vaccination (Figure 7b). The higher titers to the egg-passaged versus cell-passaged vaccine strains could be due to antigenic differences between the strains, or due to differences in receptor avidity that could make egg-passaged strains inherently easier to neutralize6165.

Discussion

We have used a sequencing-based assay43 to experimentally measure neutralization titers for sera from humans of different ages against a large set of H3N2 viral strains. There was substantial person-to-person variation in titers to different H3N2 strains that circulated in 2023. The highest neutralization titers were for a subset of the sera from children, which in some cases could neutralize all 2023-circulating H3N2 strains over an order of magnitude better than the median titer across all sera. The high titers of some children sera are consistent with the idea that neutralizing antibody responses are highest to strains encountered in the first decades of life24,27, although we also note that children are more frequently infected66 and could therefore be more likely to have recent immunologic boosting67,68. However, there were also many children with low to moderate titers, and overall titers were variable both within and between age groups.

We identified many examples of individual sera that had lower titers to specific 2023-circulating H3N2 strains, including strains with mutations at key sites of HA evolution in 2023-2024, such as 145 and 276. However, although there is clear heterogeneity among individuals in their neutralization of 2023-circulating H3N2 strains, only a modest fraction of this heterogeneity can be explained by birth cohort. Therefore, although birth year certainly contributes to strain-to-strain heterogeneity in viral neutralization19,68,29, among recent strains there is heterogeneity both between and within birth cohorts. Age does better explain the variation in titers across the past decade of vaccine strains, with older individuals typically better neutralizing older strains, consistent with prior work70,24,60,22. Therefore, our results suggest that while age explains much of the person-to-person variation in neutralization of highly diverse strains spanning a decade or more of viral evolution, the more fine-grained variation in neutralization of the strains present in a single season can be highly idiosyncratic even within an age group.

The actual evolutionary success (growth rates) of different H3N2 influenza strains in 2023 was highly correlated with the fraction of adult and children sera with low titers to each strain. This finding supports the idea17 that evolutionarily successful strains are those that evade the pre-existing immunity of the largest fraction of the human population. In our study we simply correlated strain growth rates with the titers of a large set of children and adult sera, but future work could try to weight sera to capture the relative importance of age groups in shaping viral transmission at the population level17,29,68,7173. Notably, the neutralization titers of serum pools correlated poorly both with the median titers of the individually-measured constituent sera and with the evolutionary success of different viral strains. Therefore, serum pools may be a poor proxy for the heterogeneous population immunity that shapes influenza evolution.

High-throughput sequencing-based neutralization assays like the ones described in this paper could be performed on a timeline that would help inform influenza vaccine-strain selection. Vaccine strains are selected at bi-annual meetings based on phenotypic and antigenic characterization of viruses circulating around the time of each meeting74. Our 78 strain H3N2 influenza library was designed and created in ∼12 weeks. With this library in-hand, a single graduate student (the lead author of this paper) made ∼10,000 neutralization measurements with about four cumulative weeks of experimental work. For this initial study, the subsequent data analysis took longer than the experiments, but that analysis could be greatly streamlined now that the computational pipelines are in place. We therefore suggest that libraries designed in the Northern Hemisphere fall could be used to generate data for strain selection in the following spring. The resulting data could help improve vaccine efficacy, which has sometimes been low in part due to antigenic mismatch between selected vaccine strains and circulating strains75,76. For instance, our experiments identified J.2 subclade strains such as A/Ontario/RV00796/2023 as among those that escaped the largest fraction of children and adults; such strains would have been a good antigenic match in the 2023-2024 season when the J.2 subclade predominated46. Therefore, sequencing-based neutralization measurements could provide valuable new data that could be incorporated into influenza forecasting models4749,7780 that also include other factors that shape viral evolution.

Limitations of the study

While our study analyzed sera from a large number of different individuals, these sera were from just a few sources: residual children’s sera from a hospital in Seattle, and adult sera from heavily vaccinated cohorts in Philadelphia and Australia. While H3N2 influenza strains often circulate globally81,82, there can be modest geographic stratification of strains that could lead to sera from these locations not being fully representative of global population immunity. And while natural infection plays a stronger role in shaping immunity than vaccination6,83, there could be some differences between the highly vaccinated adults in our study and the unvaccinated population that is more common globally. Additionally, some age groups are lacking from our study, and the number of sera from individuals of different ages may not be proportional to their importance in shaping influenza virus evolution.

Our experiments only measured neutralizing antibodies targeted to HA. While such antibodies are the best-established correlate of protection3,711, non-neutralizing antibodies to HA and anti-neuraminidase antibodies (which are generally not neutralizing in single-round infections like the ones used in our assay84) can also contribute to protection2,85.

Our comparisons of the neutralization titers to the growth rates of different H3N2 strains was limited by the fact that only a modest number of strains had adequate sequence data to estimate their growth rates. Strains with more sequencing counts tend to be those with moderate-to-high fitness, which therefore limited the dynamic range of growth rates across strains we were able to analyze. Additionally, because these strains are phylogenetically related it is difficult to assess the statistical significance of the correlation54, so it will be important for future work to reassess the correlations with new neutralization data against the dominant strains in future years.

Methods

Data and code availability

All data and code are publicly available at the following links:

Human sera

Human sera were from individuals of different ages from Seattle Children’s Hospital and two adult vaccine cohorts, one based in Philadelphia, United States of America, and the other based in Australia (Table 1). The Seattle Children’s Hospital sera were obtained from routine blood draws from children (ages 1-14) receiving medical care in December 2023, which was approved by the Seattle Children’s Hospital Institutional Review Board with a waiver of consent. Limited, non-identifying information was obtained from electronic health records. The sera from an adult vaccine cohort based in Philadelphia were taken from adults (ages 22-74) on the day of and 28 days post vaccination with a 2023-2024 Northern Hemisphere egg-based vaccine (FluLaval quadrivalent influenza virus vaccine from GlaxoSmithKline) between October-December 2023. This study was approved by the Institutional Review Board of the University of Pennsylvania under protocol number 849398. The sera from Australia-based adult vaccine cohort were taken from adults (ages 26-54) on the day of and 17-21 days post vaccination with a 2024 Southern Hemisphere cell based vaccine (Flucelvax quadrivalent influenza vaccine from CSL Seqirus) between April-May 2024.

Before use in sequencing-based neutralization assays, all sera were treated with receptor-destroying enzyme and heat-inactivated using a protocol described in Lee et al.30 Briefly, one vial of lyophilized receptor-destroying enzyme II (Seikan) was resuspended in 20 mL PBS and passed through a 0.22 uM filter. Then, 100 uL of each sera was incubated with 300 uL of receptor-destroying enzyme (constituting a 1:4 dilution) at 37°C for 2.5 hours and then 55°C for 30 minutes. Sera were then used immediately or stored at -80°C until use.

Selection of H3N2 strains for a library for sequencing-based neutralization assays

To identify representative circulating strains, we used H3N2 Nextstrain86,87 builds available in November 2023. These trees are subset by Nextstrain-defined clade, subclade and derived haplotype, where a derived haplotype is a more fine-grained level of genetic subdivision than subclade, and is defined as a subset of strains belonging to the same subclade that each share additional amino acid mutation(s) and have achieved some threshold of child strains. We filtered all derived haplotypes by collection date, retaining only those haplotypes with a strain sequenced within a 12-month time period of library design. We also selected additional haplotypes that had not yet achieved the given threshold of child strains to be defined as derived haplotypes, but were sequenced at high frequency within a 12-month window. For each of the derived haplotypes we selected a naturally-occurring HA sequence with the lowest divergence from the derived haplotype consensus sequence. This yielded 62 strains, representing the circulating diversity of H3N2 HA in 2023. Additionally, the past decade of egg- and cell-passaged vaccine strains were added, constituting an additional 16 strains. To visualize the 78 strains contained in the library, we built interactive trees on a background of the Nextstrain 2-year tree available in November 2023 (https://nextstrain.org/groups/jbloomlab/seqneut/h3n2-ha-2023-2024). We also drew static versions of this tree for paper figures, as described below. Note that we use the standard H3 ectodomain numbering for all these strains.

Analysis of library composition

To obtain the count and fraction of HA1 and HA ectodomain sequences matching circulating sequences (depicted in Figure 2c and Supplemental Figure 2a), all available H3 HA sequences between June 2022 and June 2024 were downloaded from the GISAID88 EpiFlu database. Each sequence was counted as a match if that given sequence was within 1 amino-acid mutation of a library HA1 or HA ectodomain sequence. The fraction matching was then obtained by dividing the count of matching sequences by all total sequences. Strain counts can be found here (https://github.com/jbloomlab/flu_H3_2023_seqneut_vs_growth/tree/main/results/strain_counts) and code for these analyses are here (https://github.com/jbloomlab/flu_H3_2023_seqneut_vs_growth)).

All trees (in Figure 2, Figure 3, Figure 5 and Supplemental Figure 5) were drawn using the baltic Python package (https://github.com/evogytis/baltic) and custom code (https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024/tree/main/non-pipeline_analyses/library_design/plasmids).

Design of an H3 HA barcoded construct

Influenza has a single stranded negative sense segmented RNA genome. For proper packaging of the viral genomic segments, there must be intact “packaging signals” both upstream and downstream of the viral genome, encompassing both coding and non-coding regions89,90 (Supplemental Figure 1a). To modify the influenza HA segment to incorporate a unique 16-nucleotide barcode without disrupting coding sequence and maintaining proper packaging, we inserted the barcode after the stop codon at the end of the HA coding sequence and duplicated the packaging signal at the 5’ end of the negative sense HA vRNA, as previously described29,43,44,89,9194 (Supplemental Figure 1b). The HA coding region itself was chimeric, with the 3’ signal peptide and 5’ C-terminal tail using the protein sequence of laboratory-adapted A/WSN/1933 (H1N1) sequence, and the HA ectodomain and transmembrane domain was derived from an H3 sequence. While the HA ectodomain region varied across strains in our library, the transmembrane domain was fixed to an H3 consensus sequence. Additionally, the last 105 nucleotides of the HA coding sequence were synonymously recoded to reduce homology between this region and the duplicated packaging signal (see Supplemental Figure 1b legend for more details), thus improving barcode retention43. This strategy of using A/WSN/1933 (H1N1) packaging signals is thought to improve incorporation of HAs, as all other non-HA genes were from A/WSN/1933 (H1N1)92,94. We used A/WSN/1933 (H1N1) non-HA genes because this lab-adapted strain produces viral stocks from reverse genetics at high titers, and because the extensive laboratory passaging reduces its biosafety risk.

Cloning a H3 HA barcoded plasmid library

We applied a cloning strategy similar to that described in Loes et al.43 with modifications for the H3 HA construct described above. Briefly, each of the selected H3 HA ectodomains were ordered from Twist Biosciences encoding sequence homology with upstream A/WSN/1933 (H1N1) HA signal peptide and transmembrane coding region. A separate barcoded fragment was generated by PCR encoding sequence homology with the HA transmembrane and C-terminal tail coding region and an Illumina Read1 sequence that immediately follows the incorporated barcode. These fragments were assembled in a three-segment assembly reaction with a plasmid backbone using HiFi Assembly Mastermix (NEB). The plasmid backbone was the same derivative of a pHH21 plasmid95 described in Welsh et al.29, which contains an upstream A/WSN/1933 (H1N1) signal peptide, a downstream A/WSN/1933 (H1N1) partial endodomain and C-terminal domain, the 3’ and 5’ untranslated regions, the pHH21 promoter and terminator29, and the restriction enzyme sites XbaI and BsmBI (after the upstream and before the downstream packaging signals, respectively). The plasmid backbone was digested with restriction enzymes XbaI and BsmBI (NEB) per manufacturer’s instructions prior to the three-segment assembly. Each HA was cloned with 3 barcodes per strain and were verified by whole plasmid sequencing through Plasmidsaurus. The barcodes linked to each strain are listed here (https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024/blob/main/data/viral_libraries/2023_H3N2_Kikawa.csv), and all HA variant plasmid maps in GenBank format are available here (https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024/tree/main/non-pipeline_analyses/library_design/plasmids).

Generating a barcoded influenza virus library

As in Loes et al.43, we used previously-described reverse genetics approaches to generate viruses for all 3 barcoded HA plasmids for each influenza strain. Briefly, co-cultures of 5e5 293T cells and 5e4 MDCK-SIAT1-TMPRSS296 cells were seeded in each well of a 6-well plate in D10 media (DMEM supplemented with 10% heat-inactivated fetal bovine serum, 2 mM L-glutamine, 100 U per mL penicillin and 100 ug per mL streptomycin). For each well, a master mix was made up of 250 ng of each non-HA segment (PB1, PB2, PA, NA, M, NP, NS) from A/WSN/1933 (H1N1) using WSN pHW18 plasmids97 and a given strain’s HA plasmid pool (containing all 3 of the independently barcoded strains). At ∼24 hours after cell seeding, this plasmid mastermix was mixed with 100 uL of DMEM and 3 uL of BioT reagent, incubated for 15 minutes at room temperature, and then carefully added dropwise to wells. At 20 hours post-transfection, media was removed, cells were washed with PBS, and then 2 mL of influenza growth media (Opti-MEM supplemented with 0.1% heat-inactivated FBS, 0.3% bovine serum albumin, 100 ug per mL of calcium chloride, 100 U per mL penicillin and 100 ug per mL streptomycin) was added. At 45 hours post media swap (∼65 hours post transfection), supernatant was removed and centrifuged at 2000 rpm for 5 minutes to remove cell debris before being aliquoted for storage at -80°C. Before use, 100 uL of each of these virus stocks were passaged once more in 2 mL of influenza growth media, each in a single well of a 6-well plate containing 4e5 MDCK-SIAT1-TMPRSS2 cells96 for ∼40 hours, as described in Loes et al.43 Supernatants were again cleared of cell debris by centrifugation at 2000 rpm for 5 minutes before being stored at -80°C.

Assessing and correcting pool balancing

We again followed a similar approach to that outline in Loes et al.43 to library pooling and balancing. Briefly, to generate a pool of HA strains with relatively equal strain balancing, we first created an equal volume pool of all barcoded HA strains. This equal-volume library was serially 2-fold diluted and used to infect MDCK-SIAT1 cells seeded at 5e4 cells per well in all wells of a 96-well plate. At ∼16 hours post infection, viral barcodes were sequenced as described below. Sequencing counts were then used to determine the relative contribution of each HA strain to the virus pool (Supplemental Figure 1d). We then calculated the relative amount of each viral strain so that each strain’s barcodes should correspond roughly to 1/78 of the sequencing counts for the total pool, and re-pooled variants accordingly (Supplemental Figure 1d). This re-pooled library was again serially diluted over MDCK-SIAT1 cells seeded at 1.5e5 cells per well in all wells of a 96-well plate, and viral barcodes were sequenced and used to verify each strain (and each strains constituent barcodes) was present at relatively equal amounts (Supplemental Figure 3e).

Determining infection conditions for sequencing-based neutralization assays

The virus library dilution and cell density used in our sequencing-based neutralization assays needed to maintain viral barcode counts in the linear range where viral transcription scales linearly with the amount of infectious virus added on cells, while maximizing the number of infectious virus particles added to each well (Supplemental Figure 3a,b). As the H3 virus library was nearly double the size of the library described in Loes et al.43, we could not assume the same exact virus dilutions and cell densities would satisfy these requirements. Therefore, to identify the virus dilution and cell density conditions for the H3 library, we infected several cell densities (5e4 cells per well, 1e5 cells per well and 1.5e5 cells per well) of MDCK-SIAT1 cells plated in influenza growth media (Supplemental Figure 3a) with serially diluted library. At ∼16 hours post infection, we isolated and sequenced viral barcodes as described below. This approach allowed us to determine the transcriptional activity per uL of the library, rather than use a metric of viral infection (e.g., TCID50 per uL), which would be less relevant for an assay like ours that relies on vRNA for the main readout. We chose the 1:32 dilution of the virus library on the maximum density of cells tested, 1.5e5 MDCK-SIAT1 cells per well of a 96-well plate (Supplemental Figure 3b,d) as it satisfied both of these outlined conditions.

Sequencing-based neutralization assays

The experimental setup is nearly identical to that outlined in Loes et al.43 A few modifications to that general protocol were made for the increased library size, and are noted accordingly. An updated protocol including these modifications is available here (dx.doi.org/10.17504/protocols.io.ewov1962klr2/v1). The initial steps of sequencing-based neutralization assays are similar to any 96-well plate neutralization assay. Human sera was prepared by treating with receptor-destroying enzyme and heat inactivation to prevent potentially confounding interactions with sialic acids contained in human serum as described above and previously30. Using an initial dilution of 1:40 (accounting for the 1:4 dilution that was used during the receptor-destroying enzyme treatment), sera are then serially 2-fold diluted across 11 columns of a 96-well plate in influenza growth media for a final volume of 50 uL diluted serum in each well. The final column of the 96-well plate was used for a no-serum control, and contained only 50 uL of influenza growth media. A 50 uL volume of diluted virus library was then added to each well, and virus-serum mixtures were incubated at 37°C with 5% CO2 for 1 hour. Following this incubation, 1.5e5 MDCK-SIAT1 cells were added per well in a total of 50 uL of influenza growth media and then incubated at the same conditions for ∼16 hours. Here, and in the virus library titration experiments described above, MDCK-SIAT1 cells are used because they lack TMPRSS2 (which cleaves HA in producing cells, freeing and activating nascent virus particles for infection), and therefore should have only limited secondary viral replication98100. At ∼16 hours post infection, cells were lysed and barcodes were sequenced as described below.

RNA extraction and barcode sequencing

Our sequencing protocol is similar to that described in Loes et al., except for one difference in the first round of PCR, which we note below. Briefly, supernatant is removed and cells are gently washed with PBS before each well is lysed with 50 uL of iScript RT-qPCR Sample Preparation Reagent (BioRad)101 containing the barcoded RNA spike-in (which was generated, purified and quantified as described in Loes et al.43) at 2 pM per well. The lysis reaction is allowed to proceed for 5 minutes before lysate is transferred off cells and into a new 96-well plate. We then use 1 uL of this lysate in 10 uL cDNA synthesis reactions for all wells using the iScript cDNA Synthesis kit (BioRad) according to the manufacturer’s instructions, using the same gene-specific primer as in Loes et al.43.

This cDNA product was then amplified in two rounds of PCR. In the first round, some sequencing preparations used the exact primers described previously43. Other sequencing preparations used slightly modified forward primers which included a 6-bp index paired with the same reverse primer described in Loes et al.43 (5′-AGCAAAAGCAGGGGAAAATAAAAACAACC-3′). The set of forward primers was as follows:

  • 5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTgctacaCCTACAATGTCGGATTTGTATTTAA TAG-3’

  • 5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTatcgatCCTACAATGTCGGATTTGTATTTAAT AG-3’

  • 5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTtgacgcCCTACAATGTCGGATTTGTATTTAA TAG-3’

  • 5’-GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTcagttgCCTACAATGTCGGATTTGTATTTAAT AG-3’)

This allowed us to multiplex different plates using the same dual indices (added during the second round of PCR), which helped decrease sequencing costs. Regardless of which round one primer set was used, we used 5 uL of cDNA as template in a 50 uL reaction using KOD Polymerase Hot Start 2x Mastermix (Sigma) according to the manufacturer’s instructions. The second round of PCR adding unique dual indexing primers was performed exactly as described in Loes et al.43, and after PCR samples were pooled at equal volume and ran on a 1% agarose gel at 85V for 40 minutes. The bands were extracted and purified using Nucleospin Gel Extraction Kit (Takara) before being purified with Ampure XP beads (Beckman Coulter) and quantified by Qubit dsDNA High Sensitivity Kit (Thermo Scientific). The indexed, purified and quantified DNA was then diluted to 4 nM and submitted for Illumina Sequencing, aiming for roughly 5e5-1e6 reads per well.

Analysis of sequencing-based neutralization data

As in Loes et al., the analysis of sequencing-based neutralization data was performed using the modular seqneut-pipeline v3.1.3 (https://github.com/jbloomlab/seqneut-pipeline) which calculates fraction infectivities from normalized barcode counts (as summarized in Figure 1), and then fits Hill curves to fraction infectivity values using the Python package neutcurve (https://github.com/jbloomlab/neutcurve). The processing of these data are described fully in Loes et al.43 and in the seqneut-pipeline README (https://github.com/jbloomlab/seqneut-pipeline/README.md). All neutralization titers and custom visualization code are available at https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024.

Modeling strain growth

The analysis performing H3 HA strain growth rate estimates using the evofr51 package is at https://github.com/jbloomlab/flu_H3_2023_seqneut_vs_growth. Briefly, we sought to make growth rate estimates for the strains in 2023 since this was the same timeframe when the sera were collected. To achieve this, we downloaded all publicly-available H3N2 sequences from the GISAID88 EpiFlu database, filtering to only those sequences that closely matched a library HA1 sequence (within one HA1 amino-acid mutation) and were collected between January 2023 and December 2023. If a sequence was within one HA1 amino-acid mutation of multiple library HA1 proteins then it was assigned to the closest one; if there were multiple equally close matches then it was assigned fractionally to each match. We only made growth rate estimates for library strains with at least 80 sequencing counts (Supplemental Figure 6a), and ignored counts for sequences that did not match a library strain (equivalent results were obtained if we instead fit a growth rate for these sequences as an “other” category). We then fit multinomial logistic regression models using the evofr51 package to the strain counts. For the plot in Supplemental Figure 5c the frequencies are averaged over a 14-day sliding window for visual clarity, but the fits were to the raw sequencing counts. For most of the analyses in this paper we used models based on requiring 80 sequencing counts to make an estimate for strain growth rates, and counting a sequence as a match if it was within one amino-acid mutation; see https://jbloomlab.github.io/flu_H3_2023_seqneut_vs_growth/ for comparable analyses using different reasonable sequence count cutoffs (e.g., 60, 50, 40 and 30, as depicted in Supplemental Figure 6). Across sequence cutoffs, we found that the fraction of individuals with low neutralization titers and number of HA1 mutations correlated strongly with these MLR-estimated strain growth rates.

To determine which of the predictors (neutralization titers or HA1 mutations) most fully explains the dependent outcome variable (growth rate), we performed a multiple regression. However, the fraction of individuals with low neutralization titers and number of HA1 mutations were moderately collinear with a variance inflation factor of 2.7, and therefore the regression coefficients could not be interpreted as their relative contribution. To circumvent this, we assessed the importance of HA1 mutations versus neutralization titers by dropping each variable from the regression and then calculating the resulting difference in the regression mean squared error across 100 permutations of a leave-one-out multiple regression. Variables with a larger effect on mean squared error are assigned a greater importance (which is equal to the permutation mean squared error minus the base mean squared error). The relative importance of each variable was then found by summing each variable importance across 100 permutations, and then dividing each variables’ importance by the total importances from both variables.. The notebook for the multiple regression and associated analyses can be explored here (https://github.com/jbloomlab/flu_H3_2023_seqneut_vs_growth/tree/main/non-pipeline_notebooks).

Supplemental tables

Cell-passaged and egg-passaged H3N2 strains chosen for the seasonal human influenza vaccine each season 2014-2024.

For each Northern Hemisphere seasonal influenza vaccine from 2014-2024, we list the cell- and egg-based vaccine strains and the GenBank and GISAID identification numbers linked to each sequence. We also note the amino-acid mutations present in each egg-passaged strain relative to its cell-passaged counterpart from the same vaccine season. Note the egg-passaged virus for the 2019-2020 season is listed (A/Kansas/14/2017X-327), but this strain did not grow to sufficiently high titers to be compatible with our assay and was excluded from the library.

Supplemental figures

Design of chimeric barcoded HA construct and virus library generation.

All HA gene segments in panels a-c are shown so that the encoded HA protein is oriented N-terminal to C-terminal, which is the 3’ to 5’ orientation of the negative-sense influenza viral genome. Stop codons are indicated with asterisks. (a) A schematic of the unmodified A/WSN/1993(H1N1) HA gene segment and partial upstream and downstream segments of the bidirectional influenza reverse-genetics plasmid97. The segment-specific regions of vRNA required for proper packaging of genome segments into nascent virions, or “packaging signals,” are denoted as red lines spanning both the untranslated regions (UTRs) and HA coding sequence, which includes the signal peptide, ectodomain, transmembrane (TM) and C-terminal (CT) regions. Outside of the HA vRNA-encoding regions, black lines show the upstream pol II promoter (for transcription of viral mRNA) and the downstream pol I promoter (for transcription of negative sense vRNA to be packaged in virus particles) that are found in the bidirectional influenza reverse genetics plasmid97; note these are not part of the actual viral RNAs. (b) A schematic of the chimeric H3 HA barcoded construct, similar to those in Loes et al.43 and Welsh et al.29 Where the HA gene coding sequence exactly match A/WSN/1993(H1N1) (at the upstream HA signal peptide and in most of the duplicated packaging signal) are colored as in panel a. The remainder of the downstream portions of the construct have changed, and are re-colored accordingly. The HA ectodomain is replaced with H3 library HA ectodomain sequences. The downstream TM domain is fixed to an H3 HA consensus sequence. The CT matches the A/WSN/1993(H1N1) protein sequence. Both the H3 TM and the A/WSN/1993(H1N1) CT are synonymously recoded to avoid complementation with the downstream duplicated packaging signal. The stop codon at the end of the HA coding region is duplicated to prevent polymerase read-through. A 16-nucleotide barcode segment follows these dual HA gene stop codons, followed by the Illumina R1 sequence necessary for preparing barcoded amplicons for sequencing. Finally, to preserve proper packaging of the entire gene segment, there is a duplicated packaging signal from A/WSN/1993(H1N1), including the partial TM, CT and downstream UTR regions. As in the constructs described in Loes et al.43, an additional stop codon is engineered that will be in-frame if this duplicated packaging signal replaces the partial packaging signal in the HA coding region; therefore, any HA gene segments that lose the barcoded segment should generate a truncated, non-functional HA protein. (c) A schematic of the construct from which the barcoded RNA spike-in are generated, identical to that in Loes et al. This construct is designed to produce barcoded RNA molecules identical to the barcoded HA segment described in panel b, but with a GFP sequence in place of the HA ectodomain sequence. As in b, the TM and CT domain are from A/WSN/1993(H1N1), and are synonymously recoded in the same manner to provide an identical priming region as incorporated in the HA construct. (d) A schematic of viral library generation, identical to that described in Loes et al.43 To express each HA variant on the surface of influenza virus particles, we use a reverse genetics protocol97 in which influenza virus gene segments are each expressed on separate plasmids and transfected into a co-culture of 293T cells and Madin-Darby Canine Kidney (MDCK) cells overexpressing 2,6-sialyltransferase (SIAT1) and transmembrane protease serine 2 (TMPRSS2). The 3 sequence-confirmed barcoded HA variants per HA sequence are pooled at this stage, and all other non-HA segments are from the lab-adapted H1N1 strain A/WSN/1993. Any changes in neutralization potency can therefore be attributed to an antigenic change within HA. Each per-strain pool of HA barcoded viruses is then passaged on MDCK-SIAT1-TMPRSS2 cells to reduce plasmid carry-over and improve viral titers. After this step, the barcoded HA strains are pooled to create a barcoded HA variant library, which can then be sequenced to assess HA strain balancing. Then, using this information to more equally represent strains, the final, balanced pool is generated.

HA diversity among the strains in the library.

(a) Count and fraction of all sequenced human H3N2 with HAs that closely match HA ectodomain protein sequences in our library. A close match is defined as being within one amino acid mutation in the HA ectodomain. This plot differs from Figure 2a by showing matches for the full HA ectodomain rather than just HA1. (b) Heatmap of pairwise HA1 amino acid sequence Hamming distances between all vaccine strains in the library. (c) Distribution of pairwise HA1 amino acid sequence Hamming distances between all 2023-circulating strains in the library. (d) The shortest HA1 amino acid sequence Hamming distance between each 2023-circulating strain in the library and another 2023-circulating strain. (e) HA1 amino acid sequence Hamming distance between each 2023-circulating strain in the library and the cell-passaged H3 component of the 2024–2025 seasonal influenza vaccine (A/Massachusetts/18/2022). (f) H3 HA trimer with antigenic regions (as defined by Munoz and Deem45) colored and labeled. The H3 HA structure is from A/Victoria/361/2011 (PDB: 4O5N107). (g) Site-wise Shannon entropy calculated from HA ectodomain sequences from all 2023-circulating strains and vaccine strains, mapped onto the same H3 HA trimer as in f. Most high-entropy sites fall within previously defined antigenic regions.

Determining and validating the cell density and viral library concentration for sequencing-based neutralization assays.

(a) To optimize the cell density and virus library dilution for the sequencing-based neutralization assay, we infected varying numbers of cells with serial dilutions of the virus library. While Loes et al.43 used 5e4 cells per well, the assay’s performance at higher cell concentrations was unknown. During the lysis step, we spiked in barcoded RNAs (as shown in Figure 1) to quantify the read counts at each virus library dilution across the tested cell concentrations. (b) The optimal conditions (indicated by the red circle) are those that maximize the number of virus particles added to each well while remaining in the linear range where transcriptional output scales linearly with the amount of infectious virus. By maximizing the number of infectious virus particles added to each well we reduce statistical noise due to bottlenecking of the library. Remaining in the linear range is crucial for several reasons. First, when cells are saturated with infectious virus, changes in an increase in the number of virions infecting a cell does not lead to a linearly proportional increase in vRNA transcription, which is the critical readout for this assay. Second, selecting a dilution at the beginning of the linear range provides the largest window for detecting decreases in viral barcodes (i.e., virus neutralization). (c) Using each cell density and virus dilution from the experiment in a, we determine the fraction of each strain in the library and the relative fraction for each barcode per strain. This analysis determines if the library is reasonably equally-balanced at the chosen experimental conditions. (d) The fraction of reads corresponding to barcoded RNA spike-in is plotted against different virus dilutions for per-well cell counts of 5e4, 1e5, and 1.5e5 cells. Each point represents a replicate serial dilution of the virus library. As cell concentration increases, the fraction of reads for barcoded RNA spike-in at higher virus library concentrations decreases, demonstrating the library MOI decreases (at the same library dilution factors) with increasing cell concentration. This trend is expected, as the number of infectious particles remains constant at each dilution, while the number of infectable cells increases. (e) At the chosen cell density and library concentration (1.5e5 cells per well and a 1:32 dilution of the virus library), we show the calculated strain fraction in the library (left) and the fraction of each barcode per strain (right) after library balancing and re-pooling (Supplemental Figure 1d). Each row of the barplots represents a different strain. In the strain fraction panel (left), the dotted line indicates the ideal fraction where all strains would be equally represented. In the barcode fraction per strain panel (right), each color represents a different barcode, and the width of each stacked bar indicates the fraction of reads attributed to that barcode for each strain.

Within- and between-plate titer measurements are highly correlated.

(a) The correlation between different barcodes for the same HA measured for a subset of individual sera and a serum pool. Each point represents the neutralization titers measured for two different barcodes for a single virus on the same plate. (b) Correlations between neutralization titers measured in separate experiments performed on separate days. Each point represents the neutralization titer for a single virus-serum pair (i.e., the median of the three replicate barcodes for that virus on each plate) as measured in two different experiments completed and sequenced on different days.

H3 HA phylodynamics, birth year cohorts and age cohorts explain some patterns in neutralization titers.

(a) Frequency of different amino-acid mutations at site 276 among human H3N2 HA sequences over time. Notably, a 276E mutation on the background of 140K (the J subclade-defining mutation) came to define the J.2 subclade. The J.2 subclade predominated during the 2023-2024 season. (b) Frequency of different amino-acid mutations at site 145 among human H3N2 HA sequences over time. (c) Phylogenetic tree of H3N2 HA sequences, where branches are colored based on their amino acid identity at site 145. Nextstrain-defined subclades are labeled at nodes. (d) For each serum sample in the children’s cohort, the geometric mean neutralization titer across 2023-circulating strains was calculated and plotted by birth year. The shaded regions show the interquartile range (IQR) of neutralization titers for each birth year group. (e) The same analysis described in d was performed for the pre-vaccination adult cohort. (f) The distribution of geometric mean neutralization titers taken over all 2023-circulating strains for all sera in each cohort. Each point represents the geometric mean neutralization titer of a single serum against 2023-circulating strains. (g) The distribution of the geometric coefficient of variation (GCV) over all 2023-circulating strains for all sera in each cohort. Each point represents the geometric coefficient of variation of a single serum across 2023-circulating strains. (h) Neutralization titer profiles across all individuals from the children cohort, re-plotting the data from Figure 3b, but now stratified by age groups.

Model fits and additional growth rate comparisons with neutralization titers and evolutionary distances.

(a) The total number of H3N2 influenza sequences collected in 2023 that closely matched each library strain HA1 sequence. In order to make a growth estimate by multinomial logistic regression, we set the threshold of at least 80 sequence counts, where each sequence needed to be an exact match or within one amino-acid mutation of a given library sequence. The reason for this threshold is that it is only possible to estimate growth rates if there are enough sequence counts to reliably determine the frequency trajectory of the strain. Twelve strains met this threshold. (b) Estimated growth rates from multinomial logistic regression and their 95% highest posterior density intervals (HPD95%) relative to a reference strain (A/Massachusetts/18/2022). (c) Correlations between estimated growth rates for the 12 strains and the per-strain median and geometric mean neutralization titers across 95 children and adults. (d) Correlations between the fraction of individuals with low neutralization titers across 95 children and adults and the strain growth rates estimated with a range of cutoffs for how many sequencing counts a strain must have to estimate its growth rate. (e) These plots are comparable to those in Figure 5b and panel d of this figure, but using only the 56 children sera. (f) This plot is identical to that in Figure 5c except it uses titer data for only the 56 children sera. (g,h) The same correlations and analysis as described in e,f, but from titers measured from the 39 pre-vaccination adult sera. (i,j) The same correlations and analysis as described in e,f, but from titers measured from the 39 post-vaccination adult sera. (k) Correlations between titers measured from pooled sera and the median of individually-measured titers for post-vaccination adults. Each dot corresponds to the pooled or median titer for a given 2023-circulating library strain. (l) Correlations between estimated growth rates for the 12 strains and the number of HA ectodomain nucleotide mutations (left) and HA ectodomain amino acid mutations (right).

Acknowledgements

We gratefully acknowledge all data contributors, i.e., the Authors and their Originating laboratories responsible for obtaining the specimens, and their Submitting laboratories for generating the genetic sequence and metadata and sharing via the GISAID Initiative, on which this research is based. An acknowledgement table for all sequences used in our analysis is available here (https://doi.org/10.55876/gis8.250304sc) and a table of GISAID accessions is available here (https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024/blob/main/non-pipeline_analyses/library_design/gisaid_acknowledgements/epi_for_episet.csv).

This work was supported in part by the NIH/NIAID under award R01AI165821 to TB and JDB, contract 75N93021C00015 to SEH, JDB and TB, and T32AI083203 to CK. JDB and TB are investigators of the Howard Hughes Medical Institute. Additionally, this research was supported by the Genomics & Bioinformatics Shared Resource, RRID:SCR_022606, of the Fred Hutch/University of Washington/Seattle Children’s Cancer Consortium (P30 CA015704), and by Fred Hutch Scientific Computing, NIH grants S10-OD-020069 and S10-OD-028685. Molecular graphics and analyses performed with UCSF ChimeraX, developed by the Resource for Biocomputing, Visualization, and Informatics at the University of California, San Francisco, with support from National Institutes of Health R01-GM129325 and the Office of Cyber Infrastructure and Computational Biology, National Institute of Allergy and Infectious Diseases.

Additional information

Author contributions

Conceptualization: CK, ANL, TB, JDB

Methodology: CK, ANL, JH, MDF, PS, TB, JDB

Software: CK, ANL, JH, MDF, TB, JDB

Investigation: CK

Resources: CK, ANL, TG, EMD, HP, IGB, JAE, SEH

Data curation: JH, TB, JDB

Writing – original draft: CK, JDB

Writing – review and editing: all authors Visualization: CK, JDB

Supervision: JDB

Funding acquisition: TB, JDB

Figures

Figures were generated with Vega-Altair v5.2 (https://altair-viz.github.io/) using custom Python code. Protein structures were visualized with UCSF ChimeraX v1.9102, and protein heatmaps were generated using the byattr command function. Figure panels were compiled and adjusted in Adobe Illustrator v27.0.