Figures and data
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig1.tif/full/max/0/default.jpg)
Model of forecast horizons and submission lags. A) Long-term forecasting models historically predicted 12 months into the future from April and October because of the time required to develop and distribute a new vaccine (Łuksza and Lässig, 2014). We tested three additional, shorter forecast horizons of 9, 6, and 3 months before the same time in the future season. For each forecast horizon, we calculated the accuracy of forecasts under each of the three submission lags shown above: no lag (blue), realistic lag (green), and ideal lag (orange). B) Observed lags in days between collection of viral samples and submission of corresponding HA sequences to GISAID (purple) for samples collected in 2019 have a mean of 98 days (approximately 3 months). A gamma distribution fit to the observed lag distribution with a similar mean and shape (green) represents a realistic submission lag that we sampled from to assign “submission dates” to simulated and natural A/H3N2 populations. A gamma distribution with a mean one third that of the realistic distribution (orange) represents an ideal submission lag analogous to the 1-month average observed lags for SARS-CoV-2 genomes. Retrospective analyses, including the fitting of forecasting models, typically filter HA sequences by collection date instead of submission date, in which case there is no lag (blue).
Figure 1—figure supplement 1. Distribution of submission lags in days for the pre-pandemic era (2019-2020) and pandemic era (2022-2023)
Figure 1—figure supplement 2. Number and proportion of A/H3N2 sequences available per timepoint and lag type
Figure 1—figure supplement 3. Number and proportion of simulated A/H3N2-like sequences available per timepoint and lag type.
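The lag model described in the Figure 1 caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 98-day mean and the one-third ratio come from the caption, while the shape parameter `ASSUMED_SHAPE`, the record layout, and all function names are hypothetical choices made for this example.

```python
import random
from datetime import date, timedelta

REALISTIC_MEAN_DAYS = 98.0                 # mean observed lag for samples collected in 2019 (from the caption)
IDEAL_MEAN_DAYS = REALISTIC_MEAN_DAYS / 3  # ideal lag: one third of the realistic mean (from the caption)
ASSUMED_SHAPE = 2.0                        # hypothetical shape; the fitted value is not stated in the caption


def sample_lag_days(mean_days, shape=ASSUMED_SHAPE, rng=random):
    """Draw one submission lag in days from a gamma distribution with the given mean."""
    scale = mean_days / shape  # the mean of a gamma distribution is shape * scale
    return rng.gammavariate(shape, scale)


def assign_submission_date(collection_date, mean_days, rng=random):
    """Assign a simulated submission date by adding a sampled lag to the collection date."""
    lag = round(sample_lag_days(mean_days, rng=rng))
    return collection_date + timedelta(days=lag)


def available_by(records, timepoint):
    """Keep only the records whose submission date falls on or before the forecast timepoint."""
    return [r for r in records if r["submission_date"] <= timepoint]
```

Filtering by `submission_date` mimics what a forecaster could actually see at a given timepoint, whereas filtering by collection date corresponds to the "no lag" scenario.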
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig2.tif/full/max/0/default.jpg)
Distance to the future per timepoint (AAs) for natural A/H3N2 populations by forecast horizon and submission lag type based on forecasts from the local branching index (LBI) and mutational load model. Each point represents a future timepoint whose population was predicted from the number of months earlier corresponding to the forecast horizon. Points are colored by submission lag type including forecasts made with no lag (blue), an ideal lag (orange), and a realistic lag (green).
Figure 2—figure supplement 1. Distance to the future for simulated A/H3N2-like populations
Figure 2—figure supplement 2. Optimal distance to the future for natural A/H3N2 populations
Figure 2—figure supplement 3. Optimal distance to the future for simulated A/H3N2-like populations
Figure 2—source code 1. Jupyter notebook used to produce this figure and the supplemental figure lives in workflow/notebooks/plot-distances-to-the-future-by-delay-type-and-horizon-for-population.py.ipynb.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_tbl1.tif/full/max/0/default.jpg)
Distance to the future in amino acids (mean ± standard deviation AAs) by forecast horizon (in months) and submission lag for A/H3N2 populations.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig3.tif/full/max/0/default.jpg)
Clade frequency errors for natural A/H3N2 clades at the same timepoint, calculated as the difference between clade frequencies without submission lag and the corresponding frequencies with either A) ideal or B) realistic submission lags. Frequency errors appear normally distributed in both lag scenarios for both C) small clades (>0% and <10% frequency) and D) large clades (≥10% frequency). Dashed lines indicate the median error from the distribution of the lag type with the same color.
Figure 3—figure supplement 1. Current clade frequency errors for simulated A/H3N2-like populations
Figure 3—source code 1. Jupyter notebook used to produce this figure and the supplemental figure lives in workflow/notebooks/plot-current-clade-frequency-errors-by-delay-type-for-populations.py.ipynb.
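The current clade frequency error in Figure 3 can be illustrated with a short sketch. It rests on assumptions not stated in the caption: the sign convention (frequency without lag minus frequency with lag), the use of the no-lag frequency to classify clades as small or large, and the dictionary layout and function names, which are hypothetical.

```python
from statistics import median


def frequency_errors(no_lag, lagged):
    """Per-clade error at one timepoint: frequency without lag minus frequency with lag.

    Clades missing from the lagged data are treated as having frequency 0.
    Only clades with nonzero frequency without lag are included.
    """
    return {
        clade: freq - lagged.get(clade, 0.0)
        for clade, freq in no_lag.items()
        if freq > 0
    }


def split_by_size(no_lag, errors, threshold=0.10):
    """Split errors into small (<10%) and large (>=10%) clades by their no-lag frequency."""
    small = [err for clade, err in errors.items() if no_lag[clade] < threshold]
    large = [err for clade, err in errors.items() if no_lag[clade] >= threshold]
    return small, large


def median_error(errors):
    """Median of a list of errors, matching the dashed lines in the figure."""
    return median(errors)
```

For example, a clade at 5% frequency without lag that has not yet appeared in the lagged data contributes an error of 0.05 to the small-clade distribution.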
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig4.tif/full/max/0/default.jpg)
Absolute forecast clade frequency errors for natural A/H3N2 populations by forecast horizon in months and submission lag type (none, ideal, or realistic) for A) small clades (<10% initial frequency) and B) large clades (≥10% initial frequency).
Figure 4—figure supplement 1. Absolute forecast clade frequency errors for simulated A/H3N2-like populations.
Figure 4—figure supplement 2. Forecast clade frequency errors for natural A/H3N2 populations.
Figure 4—figure supplement 3. Forecast clade frequency errors for simulated A/H3N2-like populations.
Figure 4—source code 1. Jupyter notebook used to produce this figure and the supplemental figures lives in workflow/notebooks/plot-forecast-clade-frequency-errors-by-delay-type-and-horizon-for-population.py.ipyn
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_tbl2.tif/full/max/0/default.jpg)
Errors in clade frequencies between observed and predicted values by forecast horizon (in months) and submission lag for A/H3N2 clades with an initial frequency ≥10% under the given lag scenario.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig5.tif/full/max/0/default.jpg)
Improvement of clade frequency errors for A/H3N2 populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions of improved vaccine development (reducing 12-month to 6-month forecast horizon), improved surveillance (reducing submission lags from 3 months on average to 1 month), or a combination of both interventions. We measured improvements from the status quo as the difference in total absolute clade frequency error per future timepoint. Positive values indicate increased forecast accuracy, while negative values indicate decreased accuracy. Each point represents the improvement of forecasts for a specific future timepoint under the given intervention. Horizontal dashed lines indicate median improvements. Horizontal dotted lines indicate upper and lower quartiles of improvements.
Figure 5—figure supplement 1. Distribution of total absolute clade frequency errors summed across clades per future timepoint for A/H3N2 populations.
Figure 5—figure supplement 2. Improvement of clade frequency errors for simulated A/H3N2-like populations.
Figure 5—figure supplement 3. Distribution of total absolute clade frequency errors summed across clades per future timepoint for simulated A/H3N2-like populations.
Figure 5—figure supplement 4. Improvement of distances to the future (AAs) for A/H3N2 populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions.
Figure 5—figure supplement 5. Improvement of distances to the future (AAs) for simulated A/H3N2-like populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions.
Figure 5—source code 1. Jupyter notebook used to produce effects of interventions on total absolute clade frequency errors lives in workflow/notebooks/plot-forecast-clade-frequency-errors-by-delay-type-and-horizon-for-population.py.ipyn
Figure 5—source code 2. Jupyter notebook used to produce effects of interventions on distances to the future lives in workflow/notebooks/plot-distances-to-the-future-by-delay-type-and-horizon-for-population.py.ipynb.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_tbl3.tif/full/max/0/default.jpg)
Improvement in A/H3N2 clade frequency forecast accuracy under realistic interventions of improved vaccine development (reducing 12-month to 6-month forecast horizon), improved surveillance (reducing submission lags from 3 months on average to 1 month), or a combination of both interventions. We measured improvements from the status quo (12-month forecast horizon and 3-month average submission lag) as the difference in total absolute clade frequency error per future timepoint and the number and proportion of future timepoints for which forecasts improved under the intervention.
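The improvement measure used in Figure 5 and the table above can be sketched as follows. This is an illustrative reading of the captions, not the authors' notebook code: the input layout (a mapping from future timepoint to a list of per-clade forecast errors) and all function names are assumptions.

```python
from statistics import median


def total_absolute_error(errors_by_timepoint):
    """Sum absolute clade frequency errors across clades for each future timepoint."""
    return {
        timepoint: sum(abs(err) for err in errors)
        for timepoint, errors in errors_by_timepoint.items()
    }


def improvements(status_quo_totals, intervention_totals):
    """Status quo error minus intervention error per future timepoint.

    Positive values mean the intervention increased forecast accuracy.
    """
    return {
        timepoint: status_quo_totals[timepoint] - intervention_totals[timepoint]
        for timepoint in status_quo_totals
        if timepoint in intervention_totals
    }


def summarize(improvement_by_timepoint):
    """Median improvement plus the number and proportion of timepoints that improved."""
    values = list(improvement_by_timepoint.values())
    n_improved = sum(value > 0 for value in values)
    return {
        "median": median(values),
        "n_improved": n_improved,
        "proportion_improved": n_improved / len(values),
    }
```

Here the status quo would be the 12-month horizon with realistic lags, and each intervention (6-month horizon, ideal lags, or both) is compared against it at the same future timepoints.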
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig6.tif/full/max/0/default.jpg)
Improvement of optimal distances to the future (AAs) for A/H3N2 populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions of improved vaccine development (reducing 12-month to 6-month forecast horizon), improved surveillance (reducing submission lags from 3 months on average to 1 month), or a combination of both interventions. We measured improvements from the status quo as the difference in optimal distances to the future per future timepoint. Positive values indicate increased forecast accuracy, while negative values indicate decreased accuracy. Each point represents the improvement of forecasts for a specific future timepoint under the given intervention. Horizontal dashed lines indicate median improvements. Horizontal dotted lines indicate upper and lower quartiles of improvements.
Figure 6—figure supplement 1. Improvement of optimal distances to the future (AAs) for simulated A/H3N2-like populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions.
Figure 6—source code 1. Jupyter notebook used to produce optimal effects of interventions on distances to the future lives in workflow/notebooks/plot-distances-to-the-future-by-delay-type-and-horizon-for-population.py.ipynb.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig1s1.tif/full/max/0/default.jpg)
Distribution of submission lags in days for the pre-pandemic era (2019-2020 in blue) and pandemic era (2022-2023 in orange). Vertical dashed lines represent mean lags for each distribution.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig1s2.tif/full/max/0/default.jpg)
A) Number of A/H3N2 sequences available per timepoint and lag type. B) Proportion of all A/H3N2 sequences without lag per timepoint and lag type.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig1s3.tif/full/max/0/default.jpg)
A) Number of simulated A/H3N2-like sequences available per timepoint and lag type. B) Proportion of all simulated A/H3N2-like sequences without lag per timepoint and lag type.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig2s1.tif/full/max/0/default.jpg)
Distance to the future per timepoint (AAs) for simulated A/H3N2-like populations by forecast horizon and submission lag type based on forecasts from the “true fitness” model.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig2s2.tif/full/max/0/default.jpg)
Optimal distance to the future per timepoint (AAs) for natural A/H3N2 populations by forecast horizon and submission lag type based on posthoc empirical fitness of the initial population.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig2s3.tif/full/max/0/default.jpg)
Optimal distance to the future per timepoint (AAs) for simulated A/H3N2-like populations by forecast horizon and submission lag type based on posthoc empirical fitness of the initial population.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig3s1.tif/full/max/0/default.jpg)
Clade frequency errors between simulated A/H3N2-like HA populations with ideal or realistic submission lags and populations without any submission lag.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig4s1.tif/full/max/0/default.jpg)
Absolute forecast clade frequency errors for simulated A/H3N2-like HA populations by forecast horizon in months and submission lag type (none, ideal, or realistic) for A) small clades (<10% initial frequency) and B) large clades (≥10% initial frequency).
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig4s2.tif/full/max/0/default.jpg)
Forecast clade frequency errors for natural A/H3N2 HA populations by forecast horizon in months and submission lag type (none, ideal, or realistic) for A) small clades (<10% initial frequency) and B) large clades (≥10% initial frequency).
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig4s3.tif/full/max/0/default.jpg)
Forecast clade frequency errors for simulated A/H3N2-like HA populations by forecast horizon in months and submission lag type (none, ideal, or realistic) for A) small clades (<10% initial frequency) and B) large clades (≥10% initial frequency).
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig5s1.tif/full/max/0/default.jpg)
Distribution of total absolute clade frequency errors summed across clades per future timepoint for A/H3N2 populations. We calculated the effects of interventions as the difference between these values per future timepoint under the status quo (12-month forecast horizon and realistic submission lag) and specific interventions.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig5s2.tif/full/max/0/default.jpg)
Improvement of clade frequency errors for simulated A/H3N2-like populations between the status quo and realistic interventions.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig5s3.tif/full/max/0/default.jpg)
Distribution of total absolute clade frequency errors summed across clades per future timepoint for simulated A/H3N2-like populations.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig5s4.tif/full/max/0/default.jpg)
Improvement of distances to the future (AAs) for A/H3N2 populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions. The effects of interventions are the differences between distances to the future per future timepoint under the status quo and specific interventions.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig5s5.tif/full/max/0/default.jpg)
Improvement of distances to the future (AAs) for simulated A/H3N2-like populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions. The effects of interventions are the differences between distances to the future per future timepoint under the status quo and specific interventions.
![](https://prod--epp.elifesciences.org/iiif/2/104282%2Fv1%2Fcontent%2F24313489v1_fig6s1.tif/full/max/0/default.jpg)
Improvement of optimal distances to the future (AAs) for simulated A/H3N2-like populations between the status quo (12-month forecast horizon and realistic submission lags) and realistic interventions.