Figures and data

Overview of sequencing-based neutralization assay.
(a) We generate a library of barcoded influenza viruses carrying different HAs, each identified by a unique 16-nucleotide barcode in its genome. See Supplemental Figure 1 for details. (b) The virus library is incubated with sera and then added to MDCK-SIAT1 cells in a 96-well plate. At 12-14 hours post infection, cells are lysed and a known concentration of barcoded RNA spike-in is added to each well. (c) The percent infectivity of each viral variant at each serum concentration is calculated by determining the barcode counts by sequencing. Viral barcode counts are normalized by dividing the viral barcode counts by the barcoded RNA spike-in counts. Percent infectivities are calculated by dividing the normalized barcode counts for each serum concentration by those in the no-serum control wells. These percent infectivities are used to fit neutralization curves and determine the neutralization titer, which is defined as the reciprocal of the serum dilution at which 50% of the virus is neutralized. Note that since each HA is associated with three different barcodes, each neutralization titer is measured in triplicate; we report the median across the three replicates.
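The calculation in panel c can be sketched as follows. This is a minimal illustration, not the pipeline used to produce the figure: the function names are hypothetical, and the titer is estimated here by simple log-linear interpolation between neighboring dilutions rather than by the fitted neutralization curve described above.

```python
import math
from statistics import median


def fraction_infectivity(viral, spikein, viral_no_serum, spikein_no_serum):
    """Spike-in-normalized barcode counts relative to the no-serum control."""
    normalized = viral / spikein
    normalized_control = viral_no_serum / spikein_no_serum
    return normalized / normalized_control


def interpolated_nt50(dilutions, infectivities):
    """Reciprocal serum dilution at 50% infectivity.

    Assumes dilutions are fractions (e.g. 1/40) ordered from most to least
    concentrated serum. Uses log-linear interpolation as a stand-in for a
    curve fit. Returns None if infectivity never crosses 0.5.
    """
    points = list(zip(dilutions, infectivities))
    for (d1, f1), (d2, f2) in zip(points, points[1:]):
        if f1 <= 0.5 <= f2 and f1 != f2:
            t = (0.5 - f1) / (f2 - f1)
            log_d = math.log(d1) + t * (math.log(d2) - math.log(d1))
            return 1.0 / math.exp(log_d)
    return None


# Each HA carries three barcodes, so the reported titer is the median of three.
titers_for_three_barcodes = [150.0, 160.0, 200.0]  # illustrative values only
reported_titer = median(titers_for_three_barcodes)
```

Multiplying the returned fraction by 100 gives the percent infectivity plotted in the neutralization curves.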

The library covers most H3N2 HA diversity in 2023.
(a) Phylogenetic tree of HAs from recent strains in the library, with 2023-circulating strains in blue, cell-passaged vaccine strains in white and egg-passaged vaccine strains in yellow. The cell-passaged vaccine strain A/Massachusetts/18/2022 is identical on the amino-acid level to a strain circulating in 2023, and so is also classified as a 2023-circulating strain in our analyses. See https://nextstrain.org/groups/jbloomlab/seqneut/h3n2-ha-2023-2024 for an interactive view of the full library on a background of the 2-year Nextstrain tree available in November 2023. (b) Phylogenetic tree of HAs from cell- and egg-passaged strains corresponding to H3N2 component of the Northern Hemisphere influenza vaccine from 2014 to 2024 (Supplemental Table 1). Cell-passaged vaccine strains are in white and egg-passaged vaccine strains are in yellow. Virus with the HA from the egg-passaged A/Kansas/14/2017 strain could not be grown in our system, and was excluded. (c) The fraction of all sequenced human H3N2 strains with HAs that closely match HA1 sequences within our library from January 2022 to June 2024. The rolling means (+/-10 days) of total sequence counts and fraction of sequences are plotted, with strains matched in the library in blue and strains not represented by the library in green. A close match was defined as being within one amino acid mutation in the HA1 domain of HA. The HA1 domain encompasses sites where the majority of antigenic evolution takes place45,49,103–106, but a similar analysis with full HA ectodomain sequences produces a qualitatively similar result (Supplemental Figure 2a).
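The close-match criterion in panel c (within one amino acid mutation in HA1) can be illustrated with a short sketch. It assumes the HA1 protein sequences are already aligned to equal length; the function names and the toy sequences are hypothetical and are not taken from the analysis code.

```python
def hamming(seq_a, seq_b):
    """Number of positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def fraction_matched(observed_seqs, library_seqs, max_mutations=1):
    """Fraction of observed sequences within max_mutations of any library sequence."""
    matched = sum(
        any(hamming(seq, lib) <= max_mutations for lib in library_seqs)
        for seq in observed_seqs
    )
    return matched / len(observed_seqs)


# Toy example: two of the three observed sequences are within one mutation.
library = ["ACDE"]
observed = ["ACDE", "ACDF", "AGGF"]
matched_fraction = fraction_matched(observed, library)
```

Applying the same function to sequences binned by collection date would give a per-window matched fraction, which could then be smoothed with a rolling mean.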

Overview of children and adult cohorts.
Summary of sera from the children and adult vaccination cohorts. Children sera were from routine hospital or clinic visit blood draws, and had limited information about vaccine and exposure histories pulled from electronic health records. Adult sera were collected through vaccine response studies based in the United States of America (USA) and Australia at various time points pre-vaccination and post-vaccination. Detailed metadata can be found at https://github.com/jbloomlab/flu_seqneut_H3N2_2023-2024/tree/main/data/sera_metadata

Neutralization titers to 2023-circulating strains for children and adults vary among individuals and cohorts.
(a) Neutralization titer profiles for a child and a pre-vaccination adult, showing the titer of each serum against each of the 2023-circulating strains in the library. Neutralization by the child’s serum is reduced for strains with the S145N mutation in antigenic region A, while neutralization by the adult’s serum is reduced by multiple mutations within antigenic region C (sites 275 and 276). Strains are grouped phylogenetically on the x-axis. (b) Neutralization titer profiles across all individuals from the children and adult pre-vaccination cohorts. Each thin line is a neutralization titer profile for an individual serum. Each point represents the median neutralization titer across all sera for that strain.

Pooled sera neutralization titer profiles do not capture nuances from individually-measured sera.
Correlations between titers measured from pooled sera and the median of individually-measured titers are plotted for (a) children, (b) pre-vaccination adults, and (c) children and pre-vaccination adults together. Each dot corresponds to the pooled or median titer for a given 2023-circulating library strain. (d) The full neutralization profiles for all children and pre-vaccination adults individually and as a serum pool, replotted from Figure 3b. Titers for individual sera are plotted as thin red lines, with the median titer across all children and adult sera indicated by red points, and the titer of the pooled serum indicated by a black line and points.

Individually-measured serum neutralization titers correlate with the growth rate of viral strains in the human population in 2023.
(a) Strain frequencies and model fits for the strains with sufficient sequencing counts to estimate a growth rate using multinomial logistic regression. Dots represent strain frequencies averaged across a 14-day sliding window, and lines represent the model fit. (b) Phylogenetic tree of the 12 viral strains with sufficient sequencing data to estimate their relative growth rates. Each strain is colored by its growth rate relative to the A/Massachusetts/18/2022 strain. (c) Correlation between the strain growth rates and serum neutralization titers from 95 children and pre-vaccination adults. The plot on the left shows the fraction of sera with titers below 138. The plot on the right shows the titers of an equal-volume pool of all sera. In both plots, each point corresponds to the growth rate and titer for one of the 12 viral strains. The numbers in the upper corners show the Pearson correlation coefficient R and the P-value as assessed by randomizing the experimental data among strains 200 times. (d) Correlation of strain growth rate with fraction of sera that have a titer below a given threshold for thresholds between 40 and 1000. The thick black line shows the actual correlation at each threshold, and the thin blue lines show the correlations for 200 randomizations of the experimental data among the strains. The threshold that gave the highest correlation is 138; none of the randomizations had a correlation as high as the actual data at any threshold so the P-value is < 0.005. (e) Correlation of strain growth rate and the number of HA1 amino-acid mutations relative to the common ancestor of the 12 strains with growth rate estimates. Correlations are similar in strength for full HA ectodomain amino-acid mutations and HA nucleotide mutations (Supplemental Figure 6l).
(f) Correlations between the fraction of sera with titers below 138 and the number of HA1 amino-acid mutations for the 12 strains with estimated growth rates (left) and all 62 of the 2023-circulating strains in the library (right).
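The correlation and randomization test described in this legend can be sketched in a few lines. This is an illustrative outline under stated assumptions, not the analysis code: the function names are hypothetical, the randomization shuffles the per-strain titer summary among strains, and only a single threshold is tested (the legend describes scanning thresholds between 40 and 1000).

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def fraction_below(titers, threshold):
    """Fraction of sera whose titer against one strain is below the threshold."""
    return sum(t < threshold for t in titers) / len(titers)


def randomization_test(growth_rates, strain_values, n_random=200, seed=0):
    """Actual correlation, and the fraction of shuffles that reach or exceed it."""
    rng = random.Random(seed)
    actual = pearson_r(growth_rates, strain_values)
    shuffled = list(strain_values)
    n_as_high = 0
    for _ in range(n_random):
        rng.shuffle(shuffled)  # reassign titer summaries among strains
        if pearson_r(growth_rates, shuffled) >= actual:
            n_as_high += 1
    return actual, n_as_high / n_random
```

If no randomization reaches the actual correlation, the returned fraction is 0 and the P-value is bounded by 1/n_random, which for 200 randomizations corresponds to the P < 0.005 reported above.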

HA1 mutations present in strains with estimated growth rates.
HA1 amino-acid mutations for each of the 12 strains with sufficient sequence counts to estimate growth rates. Mutations in known antigenic regions as defined by Muñoz and Deem45 are in bold text, and mutations outside of antigenic regions are in regular weight text. The table lists all HA1 mutations relative to the most-recent common ancestor of the 12 strains.

Neutralization titers to the past decade of vaccine strains.
Neutralization titer profiles to the past decade of cell-passaged Northern Hemisphere vaccine strains across all individuals from the children and adult pre-vaccination cohorts, stratified by age group as indicated in plot panel titles. Strains are ordered on the x-axis by the year they were collected. Each thin line is a neutralization titer profile for an individual serum, and each colored point represents the median titer across all sera for that strain. See Supplemental Table 1 for details on which strains were in the vaccines for which seasons.

Impact of vaccination on neutralization titers.
(a) Neutralization titers pre- and post-vaccination for the USA-based adult cohort (top) and the Australia-based adult cohort (bottom) against the 2023-circulating library strains. Points indicate the median titers across participants and the shaded regions show the interquartile range. The H3N2 component strain of the 2024 Southern Hemisphere seasonal influenza vaccine (A/Massachusetts/18/2022) is indicated by bold text and an asterisk; this strain circulated in 2023 and so was classified as both a 2023-circulating strain and a vaccine strain in our analysis. (b) Neutralization titers pre- and post-vaccination for the two cohorts to each cell-passaged or egg-passaged Northern Hemisphere H3N2 vaccine strain from the past decade (Supplemental Table 1). Viruses are listed on the x-axis in order of increasing year in which they were isolated. The USA-based cohort received the 2023-2024 Northern Hemisphere seasonal influenza vaccine (A/Darwin/9/2021), and the Australia-based cohort received the 2024 Southern Hemisphere seasonal influenza vaccine (A/Massachusetts/18/2022).

Cell-passaged and egg-passaged H3N2 strains chosen for the seasonal human influenza vaccine each season 2014-2024.
For each Northern Hemisphere seasonal influenza vaccine from 2014-2024, we list the cell- and egg-based vaccine strains and the GenBank and GISAID identification numbers linked to each sequence. We also note the amino-acid mutations present in each egg-passaged strain relative to its cell-passaged counterpart from the same vaccine season. Note the egg-passaged virus for the 2019-2020 season is listed (A/Kansas/14/2017X-327), but this strain did not grow to sufficiently high titers to be compatible with our assay and was excluded from the library.

Design of chimeric barcoded HA construct and virus library generation.
All HA gene segments in panels a-c are shown so that the encoded HA protein is oriented N-terminal to C-terminal, which is the 3’ to 5’ orientation of the negative-sense influenza viral genome. Stop codons are indicated with asterisks. (a) A schematic of the unmodified A/WSN/1993(H1N1) HA gene segment and partial upstream and downstream segments of the bidirectional influenza reverse-genetics plasmid97. The segment-specific regions of vRNA required for proper packaging of genome segments into nascent virions, or “packaging signals,” are denoted as red lines spanning both the untranslated regions (UTRs) and HA coding sequence, which includes the signal peptide, ectodomain, transmembrane (TM) and C-terminal (CT) regions. Outside of the HA vRNA-encoding regions, black lines show the upstream pol II promoter (for transcription of viral mRNA) and the downstream pol I promoter (for transcription of negative sense vRNA to be packaged in virus particles) that are found in the bidirectional influenza reverse genetics plasmid97; note these are not part of the actual viral RNAs. (b) A schematic of the chimeric H3 HA barcoded construct, similar to those in Loes et al.43 and Welsh et al.29 Regions where the HA gene coding sequence exactly matches A/WSN/1993(H1N1) (the upstream HA signal peptide and most of the duplicated packaging signal) are colored as in panel a. The remaining downstream portions of the construct differ and are re-colored accordingly. The HA ectodomain is replaced with H3 library HA ectodomain sequences. The downstream TM domain is fixed to an H3 HA consensus sequence. The CT matches the A/WSN/1993(H1N1) protein sequence. Both the H3 TM and the A/WSN/1993(H1N1) CT are synonymously recoded to avoid complementation with the downstream duplicated packaging signal. The stop codon at the end of the HA coding region is duplicated to prevent polymerase read-through.
A 16-nucleotide barcode segment follows these dual HA gene stop codons, followed by the Illumina R1 sequence necessary for preparing barcoded amplicons for sequencing. Finally, to preserve proper packaging of the entire gene segment, there is a duplicated packaging signal from A/WSN/1993(H1N1), including the partial TM, CT and downstream UTR regions. As in the constructs described in Loes et al.43, an additional stop codon is engineered that will be in-frame if this duplicated packaging signal replaces the partial packaging signal in the HA coding region; therefore, any HA gene segments that lose the barcoded segment should generate a truncated, non-functional HA protein. (c) A schematic of the construct from which the barcoded RNA spike-ins are generated, identical to that in Loes et al. This construct is designed to produce barcoded RNA molecules identical to the barcoded HA segment described in panel b, but with a GFP sequence in place of the HA ectodomain sequence. As in b, the TM and CT domain are from A/WSN/1993(H1N1), and are synonymously recoded in the same manner to provide an identical priming region as incorporated in the HA construct. (d) A schematic of viral library generation, identical to that described in Loes et al.43 To express each HA variant on the surface of influenza virus particles, we use a reverse genetics protocol97 in which influenza virus gene segments are each expressed on separate plasmids and transfected into a co-culture of 293T cells and Madin-Darby Canine Kidney (MDCK) cells overexpressing 2,6-sialyltransferase (SIAT1) and transmembrane protease serine 2 (TMPRSS2). The 3 sequence-confirmed barcoded HA variants per HA sequence are pooled at this stage, and all other non-HA segments are from the lab-adapted H1N1 strain A/WSN/1993. Any changes in neutralization potency can therefore be attributed to an antigenic change within HA.
Each per-strain pool of HA barcoded viruses is then passaged on MDCK-SIAT1-TMPRSS2 cells to reduce plasmid carry-over and improve viral titers. After this step, the barcoded HA strains are pooled to create a barcoded HA variant library, which can then be sequenced to assess HA strain balancing. Then, using this information to more equally represent strains, the final, balanced pool is generated.

HA diversity among the strains in the library.
(a) Count and fraction of all sequenced human H3N2 strains with HAs that closely match HA ectodomain protein sequences in our library. A close match is defined as being within one amino acid mutation in the HA ectodomain. This plot differs from Figure 2c by showing matches for the full HA ectodomain rather than just HA1. (b) Heatmap of pairwise HA1 amino acid sequence Hamming distances between all vaccine strains in the library. (c) Distribution of pairwise HA1 amino acid sequence Hamming distances between all 2023-circulating strains in the library. (d) The shortest HA1 amino acid sequence Hamming distance between each 2023-circulating strain in the library and another 2023-circulating strain. (e) HA1 amino acid sequence Hamming distance between each 2023-circulating strain in the library and the cell-passaged H3 component of the 2024–2025 seasonal influenza vaccine (A/Massachusetts/18/2022). (f) H3 HA trimer with antigenic regions (as defined by Muñoz and Deem45) colored and labeled. The H3 HA structure is from A/Victoria/361/2011 (PDB: 4O5N107). (g) Site-wise Shannon entropy calculated from HA ectodomain sequences from all 2023-circulating strains and vaccine strains, mapped onto the same H3 HA trimer as in f. Most high-entropy sites fall within previously defined antigenic regions.
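The site-wise Shannon entropy in panel g can be computed from an alignment along the following lines. This is a minimal sketch: the function name is hypothetical, the sequences are assumed to be aligned to equal length, and the use of log base 2 (entropy in bits) is an assumption, since the legend does not state the base.

```python
import math
from collections import Counter


def sitewise_entropy(aligned_seqs):
    """Shannon entropy (in bits) of the amino-acid distribution at each site."""
    entropies = []
    for column in zip(*aligned_seqs):  # iterate over alignment columns
        n = len(column)
        counts = Counter(column)
        entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
        entropies.append(entropy)
    return entropies


# Toy alignment: site 1 is invariant, site 2 is split evenly between two residues.
toy_entropy = sitewise_entropy(["AC", "AD"])
```

Invariant sites have zero entropy, and a site split evenly between two residues has an entropy of one bit, so higher values mark more variable positions.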

Determining and validating the cell density and viral library concentration for sequencing-based neutralization assays.
(a) To optimize the cell density and virus library dilution for the sequencing-based neutralization assay, we infected varying numbers of cells with serial dilutions of the virus library. While Loes et al.43 used 5e4 cells per well, the assay’s performance at higher cell concentrations was unknown. During the lysis step, we spiked in barcoded RNAs (as shown in Figure 1) to quantify the read counts at each virus library dilution across the tested cell concentrations. (b) The optimal conditions (indicated by the red circle) are those that maximize the number of virus particles added to each well while remaining in the linear range where transcriptional output scales linearly with the amount of infectious virus. By maximizing the number of infectious virus particles added to each well we reduce statistical noise due to bottlenecking of the library. Remaining in the linear range is crucial for several reasons. First, when cells are saturated with infectious virus, an increase in the number of virions infecting a cell does not lead to a linearly proportional increase in vRNA transcription, which is the critical readout for this assay. Second, selecting a dilution at the beginning of the linear range provides the largest window for detecting decreases in viral barcodes (i.e., virus neutralization). (c) Using each cell density and virus dilution from the experiment in a, we determine the fraction of each strain in the library and the relative fraction for each barcode per strain. This analysis determines if the library is reasonably equally-balanced at the chosen experimental conditions. (d) The fraction of reads corresponding to barcoded RNA spike-in is plotted against different virus dilutions for per-well cell counts of 5e4, 1e5, and 1.5e5 cells. Each point represents a replicate serial dilution of the virus library.
As cell concentration increases, the fraction of reads for barcoded RNA spike-in at higher virus library concentrations decreases, demonstrating the library MOI decreases (at the same library dilution factors) with increasing cell concentration. This trend is expected, as the number of infectious particles remains constant at each dilution, while the number of infectable cells increases. (e) At the chosen cell density and library concentration (1.5e5 cells per well and a 1:32 dilution of the virus library), we show the calculated strain fraction in the library (left) and the fraction of each barcode per strain (right) after library balancing and re-pooling (Supplemental Figure 1d). Each row of the barplots represents a different strain. In the strain fraction panel (left), the dotted line indicates the ideal fraction where all strains would be equally represented. In the barcode fraction per strain panel (right), each color represents a different barcode, and the width of each stacked bar indicates the fraction of reads attributed to that barcode for each strain.
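The read fractions plotted in panels d and e can be derived from per-barcode counts roughly as follows. This is an illustrative sketch only: the function name, the input layout (a count per barcode plus a barcode-to-strain mapping), and the toy values are assumptions, not the format used by the analysis pipeline.

```python
from collections import defaultdict


def read_fractions(barcode_counts, barcode_to_strain, spikein_count):
    """Spike-in read fraction, per-strain fraction, and per-barcode fraction within strain."""
    viral_total = sum(barcode_counts.values())
    spikein_fraction = spikein_count / (viral_total + spikein_count)

    # Sum the counts of all barcodes that belong to the same strain.
    strain_counts = defaultdict(int)
    for barcode, count in barcode_counts.items():
        strain_counts[barcode_to_strain[barcode]] += count

    strain_fraction = {s: c / viral_total for s, c in strain_counts.items()}
    barcode_fraction = {
        b: c / strain_counts[barcode_to_strain[b]] for b, c in barcode_counts.items()
    }
    return spikein_fraction, strain_fraction, barcode_fraction


# Toy example with two strains and three barcodes.
counts = {"bc1": 30, "bc2": 10, "bc3": 60}
mapping = {"bc1": "strainA", "bc2": "strainA", "bc3": "strainB"}
spike_frac, strain_frac, bc_frac = read_fractions(counts, mapping, spikein_count=100)
ideal_strain_fraction = 1 / len(strain_frac)  # dotted line in the strain fraction panel
```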

Within- and between-plate titer measurements are highly correlated.
(a) The correlation between different barcodes for the same HA measured for a subset of individual sera and a serum pool. Each point represents the neutralization titers measured for two different barcodes for a single virus on the same plate. (b) Correlations between neutralization titers measured in separate experiments performed on separate days. Each point represents the neutralization titer for a single virus-serum pair (i.e., the median of the three replicate barcodes for that virus on each plate) as measured in two different experiments completed and sequenced on different days.

H3 HA phylodynamics, birth year cohorts and age cohorts explain some patterns in neutralization titers.
(a) Frequency of different amino-acid mutations at site 276 among human H3N2 HA sequences over time. Notably, a 276E mutation on the background of 140K (the J subclade-defining mutation) came to define the J.2 subclade. The J.2 subclade predominated during the 2023-2024 season. (b) Frequency of different amino-acid mutations at site 145 among human H3N2 HA sequences over time. (c) Phylogenetic tree of H3N2 HA sequences, where branches are colored based on their amino acid identity at site 145. Nextstrain-defined subclades are labeled at nodes. (d) For each serum sample in the children’s cohort, the geometric mean neutralization titer across 2023-circulating strains was calculated and plotted by birth year. The shaded regions show the interquartile range (IQR) of neutralization titers for each birth year group. (e) The same analysis described in d was performed for the pre-vaccination adult cohort. (f) The distribution of geometric mean neutralization titers taken over all 2023-circulating strains for all sera in each cohort. Each point represents the geometric mean neutralization titer of a single serum against 2023-circulating strains. (g) The distribution of the geometric coefficient of variation (GCV) over all 2023-circulating strains for all sera in each cohort. Each point represents the geometric coefficient of variation of a single serum across 2023-circulating strains. (h) Neutralization titer profiles across all individuals from the children cohort, re-plotting the data from Figure 3b, but now stratified by age groups.
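The per-serum summary statistics in panels d through g can be sketched as below. The geometric coefficient of variation is computed here with the common definition sqrt(exp(s^2) - 1), where s is the sample standard deviation of the natural-log titers; the legend does not state which definition was used, so both this formula and the function names should be read as assumptions.

```python
import math
from statistics import fmean, stdev


def geometric_mean(titers):
    """Geometric mean of a serum's titers across strains."""
    return math.exp(fmean([math.log(t) for t in titers]))


def geometric_cv(titers):
    """Geometric coefficient of variation, assuming the sqrt(exp(s^2) - 1) definition."""
    s = stdev([math.log(t) for t in titers])  # sample SD of natural-log titers
    return math.sqrt(math.exp(s ** 2) - 1)


# Illustrative values only: one serum measured against two strains.
example_gmt = geometric_mean([100.0, 400.0])
```

A serum that neutralizes all strains equally has a geometric coefficient of variation of zero, while larger values indicate a titer profile that varies more strongly from strain to strain.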

Model fits and additional growth rate comparisons with neutralization titers and evolutionary distances.
(a) The total number of H3N2 influenza sequences collected in 2023 that closely matched each library strain HA1 sequence. To estimate a growth rate by multinomial logistic regression, we required at least 80 sequence counts per strain, where each sequence needed to be an exact match or within one amino-acid mutation of a given library sequence. The reason for this threshold is that it is only possible to estimate growth rates if there are enough sequence counts to reliably determine the frequency trajectory of the strain. Twelve strains met this threshold. (b) Estimated growth rates from multinomial logistic regression and their 95% highest posterior density intervals (HPD95%) relative to a reference strain (A/Massachusetts/18/2022). (c) Correlations between estimated growth rates for the 12 strains and the per-strain median and geometric mean neutralization titers across 95 children and adults. (d) Correlations between the fraction of individuals with low neutralization titers across 95 children and adults and the strain growth rates estimated with a range of cutoffs for how many sequencing counts a strain must have to estimate its growth rate. (e) These plots are comparable to those in Figure 5b and panel d of this figure, but using only the 56 children sera. (f) This plot is identical to that in Figure 5c except it uses titer data for only the 56 children sera. (g,h) The same correlations and analysis as described in e,f, but from titers measured from the 39 pre-vaccination adult sera. (i,j) The same correlations and analysis as described in e,f, but from titers measured from the 39 post-vaccination adult sera. (k) Correlations between titers measured from pooled sera and the median of individually-measured titers for post-vaccination adults. Each dot corresponds to the pooled or median titer for a given 2023-circulating library strain.
(l) Correlations between estimated growth rates for the 12 strains and the number of HA ectodomain nucleotide mutations (left) and HA ectodomain amino acid mutations (right).