Rapidly-evolving sites in the Class I genes.
A) Rapidly-evolving sites are primarily located in exons 2 and 3. The exons are concatenated so that the x-axis shows the cumulative position along the coding region, and the dashed orange lines denote exon boundaries. The three genes are aligned so that sites that line up vertically across the genes are evolutionarily equivalent. The y-axis shows the substitution rate at each site, expressed as a log2 fold-change (the base-2 logarithm of each site’s evolutionary rate divided by the mean rate among mostly-gap sites in each alignment; see Methods). B) Rapidly-evolving sites are located in each protein’s peptide-binding pocket. Structures are Protein Data Bank (Berman et al., 2000) entries 4BCE (Teze et al., 2014) for HLA-B, 4NT6 (Choo et al., 2014) for HLA-C, and 7P4B (Walters et al., 2022) for HLA-E, with images created in PyMOL (Sch, 2021). The substitution rate for each amino acid is computed as the mean substitution rate of the three sites composing its codon. Orange indicates rapidly-evolving amino acids, while teal indicates conserved amino acids. C) Rapidly-evolving amino acids are significantly closer to the peptide than conserved amino acids. The y-axis shows the BEAST2 substitution rate and the x-axis shows the minimum distance to the bound peptide, measured in PyMOL (Sch, 2021). Each point is an amino acid, and distances are averaged over several structures (see Methods). The orange line is a linear regression of substitution rate on minimum distance, with slope and p-value annotated on each panel.
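The quantities plotted in this figure can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy rates, and the gap mask are hypothetical, and it assumes that per-codon averaging is applied to the log2 fold-change values (the caption does not say whether averaging precedes or follows the log transform). Only the slope and intercept are fitted here; a p-value like the one annotated in panel C would need an additional routine such as `scipy.stats.linregress`.

```python
import numpy as np

def log2_fold_change(site_rates, baseline_mask):
    """log2 of each site's rate divided by the mean rate of the baseline sites.

    baseline_mask marks the sites used as the reference (in the caption,
    the mostly-gap sites of the alignment).
    """
    site_rates = np.asarray(site_rates, dtype=float)
    baseline_mask = np.asarray(baseline_mask, dtype=bool)
    baseline = site_rates[baseline_mask].mean()
    return np.log2(site_rates / baseline)

def codon_means(site_values):
    """Average consecutive triplets of nucleotide sites into per-codon values."""
    site_values = np.asarray(site_values, dtype=float)
    if site_values.size % 3 != 0:
        raise ValueError("number of sites must be a multiple of 3")
    return site_values.reshape(-1, 3).mean(axis=1)

def regression_line(distances, rates):
    """Least-squares fit of rate on distance; returns (slope, intercept)."""
    slope, intercept = np.polyfit(distances, rates, deg=1)
    return slope, intercept

# Toy data (hypothetical): six nucleotide sites, the first two are baseline sites.
rates = np.array([1.0, 1.0, 4.0, 8.0, 2.0, 2.0])
gap_mask = np.array([True, True, False, False, False, False])

lfc = log2_fold_change(rates, gap_mask)   # per-site log2 fold-change
per_codon = codon_means(lfc)              # one value per amino acid

# Toy distances (hypothetical, in angstroms) paired with toy per-residue rates.
slope, intercept = regression_line([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

A negative slope in such a fit corresponds to the pattern described in panel C, where residues nearer the peptide evolve faster.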