Strains are colored by their amino acid sequence composition such that genetically similar strains have similar colors (Materials and methods). (A) Strains at timepoint t are shown in their phylogenetic context and sized by their frequency at that timepoint. The estimated future population at timepoint u is projected to the right, with strains scaled in size by their projected frequency based on the known fitness of each simulated strain. (B) The frequency trajectories of strains from timepoint t to u represent the predicted growth of the dark blue strains to the detriment of the pink strains. (C) Strains at timepoint u are shown in the corresponding phylogeny for that timepoint and scaled by their frequency at that time. (D) The observed frequency trajectories of strains at timepoint u broadly recapitulate the model’s forecasts while also revealing increased diversity of sequences at the future timepoint that the model could not anticipate, e.g., the emergence of the light blue cluster from within the successful dark blue cluster. Model coefficients minimize the earth mover’s distance between amino acid sequences in the observed and estimated future populations across all training windows.
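To make the distance in the last sentence concrete, the following is a minimal sketch of an earth mover's distance between an estimated and an observed population of sequences. It is not the authors' implementation. It assumes that each population is a mapping from amino acid sequence to frequency, that both sets of frequencies sum to the same total, and that the cost of moving frequency between two sequences is their Hamming distance; the caption does not state the ground distance, so that choice is an assumption here. The function names `hamming`, `min_cost_transport`, and `earth_movers_distance` and the toy sequences are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_cost_transport(supply, demand, cost, tol=1e-12):
    """Minimum cost of moving `supply` onto `demand` (successive shortest paths)."""
    m, n = len(supply), len(demand)
    source, sink = m + n, m + n + 1
    inf = float("inf")
    # Each edge is [target, residual_capacity, cost, index_of_reverse_edge].
    graph = [[] for _ in range(m + n + 2)]

    def add_edge(u, v, cap, c):
        graph[u].append([v, cap, c, len(graph[v])])
        graph[v].append([u, 0.0, -c, len(graph[u]) - 1])

    for i in range(m):
        add_edge(source, i, supply[i], 0.0)
        for j in range(n):
            add_edge(i, m + j, inf, cost[i][j])
    for j in range(n):
        add_edge(m + j, sink, demand[j], 0.0)

    total = 0.0
    while True:
        # Bellman-Ford: residual edges can have negative cost.
        dist = [inf] * len(graph)
        prev = [None] * len(graph)
        dist[source] = 0.0
        for _ in range(len(graph) - 1):
            changed = False
            for u in range(len(graph)):
                if dist[u] == inf:
                    continue
                for k, (v, cap, c, _) in enumerate(graph[u]):
                    if cap > tol and dist[u] + c < dist[v]:
                        dist[v] = dist[u] + c
                        prev[v] = (u, k)
                        changed = True
            if not changed:
                break
        if prev[sink] is None:
            break  # no augmenting path left: all movable mass is placed

        # Find the bottleneck capacity along the path, then push that much flow.
        push, v = inf, sink
        while v != source:
            u, k = prev[v]
            push = min(push, graph[u][k][1])
            v = u
        v = sink
        while v != source:
            u, k = prev[v]
            graph[u][k][1] -= push
            graph[v][graph[u][k][3]][1] += push
            v = u
        total += push * dist[sink]
    return total


def earth_movers_distance(estimated, observed):
    """EMD between two {sequence: frequency} populations, Hamming ground cost."""
    src, dst = list(estimated), list(observed)
    cost = [[float(hamming(a, b)) for b in dst] for a in src]
    return min_cost_transport(
        [estimated[s] for s in src], [observed[s] for s in dst], cost
    )


# Toy populations (hypothetical sequences and frequencies).
estimated_future = {"AAA": 0.7, "AAT": 0.3}
observed_future = {"AAA": 0.5, "AAT": 0.3, "ATT": 0.2}
emd = earth_movers_distance(estimated_future, observed_future)
```

In this toy case the observed population contains a sequence (`ATT`) that the estimate lacks, so some estimated frequency has to be moved across one or two amino acid differences and the distance is positive; identical populations give a distance of zero. A fitting procedure would evaluate this quantity for each training window and adjust the model coefficients to reduce it.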