Class I MHC genes present in different species.

The primate evolutionary tree (Kuderna et al., 2023) is shown on the left-hand side (nonprimate icons are shown in beige). The MHC region has been well characterized in only a handful of species; the rows corresponding to these species are highlighted in gray. Species that are not highlighted have partially characterized or completely uncharacterized MHC regions. Asterisks indicate new information provided by the present study, typically the discovery of a gene’s presence in a species. Each column/color indicates an orthologous group of genes, labeled at the top and ordered as they are in the human genome (note that not all genes appear on every haplotype). A point indicates that a given gene is present in a given species; when a species has more than 3 paralogs of a given gene, only 3 points are shown for visualization purposes. Filled points indicate that the gene is fixed in that species, outlined points indicate that the gene is unfixed, and semi-transparent points indicate that the gene’s fixedness is not known. The shape of the point indicates the gene’s role: pseudogene, classical MHC gene, non-classical MHC gene, a gene that shares features of both (“dual characteristics”), or unknown. The horizontal gray brackets indicate a breakdown of 1:1 orthology, where genes below the bracket are orthologous to 2 or more separate loci above the bracket. The set of two adjacent gray brackets in the top center of the figure shows a block duplication. Gene labels in the middle of the plot (“W”, “A”, “G”, “B”, and “I”) clarify genes that are named differently in different species. OWM, Old-World Monkeys; NWM, New-World Monkeys.
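The point-encoding rules described in this legend can be summarized in a short sketch. This is only an illustration of the stated rules (at most 3 points per gene and species, with fill style set by fixedness); the names `GeneCall`, `points_to_draw`, and the style strings are hypothetical and are not taken from the study's plotting code.

```python
from dataclasses import dataclass

MAX_POINTS = 3  # the legend states that at most 3 points are drawn

# Hypothetical style labels mirroring the legend's three fill styles.
FILL_BY_FIXEDNESS = {
    "fixed": "filled",
    "unfixed": "outlined",
    "unknown": "semi-transparent",
}


@dataclass
class GeneCall:
    """One gene/species cell of the figure (illustrative structure only)."""
    species: str
    gene: str
    n_paralogs: int
    fixedness: str  # "fixed", "unfixed", or "unknown"


def points_to_draw(call: GeneCall) -> list[str]:
    """Return one fill style per point, capped at MAX_POINTS."""
    n_points = min(call.n_paralogs, MAX_POINTS)
    return [FILL_BY_FIXEDNESS[call.fixedness]] * n_points


# Made-up example values, not data from the figure.
example = GeneCall(species="species_x", gene="gene_y", n_paralogs=7, fixedness="unfixed")
print(points_to_draw(example))
```

Point shape (pseudogene, classical, non-classical, dual characteristics, or unknown) could be handled with a second lookup table in the same way.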

Class II MHC genes present in different species.

The mammal evolutionary tree is shown on the left-hand side, with an emphasis on the primates (Foley et al., 2023; Kuderna et al., 2023). The rest of the figure design follows that of Figure 1, except that space constraints did not require us to limit the number of points shown per locus/species. OWM, Old-World Monkeys; NWM, New-World Monkeys; Strep., Strepsirrhini.

The Class I exon 4 multi-gene BEAST2 tree.

The Class I multi-gene tree was constructed using exon 4 (non-PBR) sequences from Class I genes spanning the primates. A) For the purposes of visualization, each clade in the multi-gene tree is collapsed and labeled according to the main species group and gene content of the clade. The white labels on colored rectangles indicate the species group of origin, while the colored text to the right of each rectangle indicates the gene name. The abbreviations are defined in the species key on the right-hand side of panel A. B) The expanded MHC-F clade (corresponding to the clade in panel A marked by a †). C) The expanded NWM MHC-G clade (marked by a < in panel A). In panels B and C, each tip represents a sequence and is labeled with the species of origin (white label on colored rectangle) and the sequence ID or allele name (colored text to the right of each rectangle; see Appendix 2). Dashed branches have been shrunk to 10% of their original length (to clarify detail in the rest of the tree at this scale). OWM: old-world monkeys; NWM: new-world monkeys; Cat.: Catarrhini (apes and OWM); Pri.: Primates (apes, OWM, and NWM); Mam.: mammals (primates and other outgroup mammals).
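The branch shrinking mentioned in this legend amounts to a simple rescaling at display time. The sketch below assumes each branch carries a length and a flag saying whether it is drawn dashed. How branches were chosen for shrinking is not stated in the legend, so the flag, the function name, and the example values are hypothetical.

```python
SHRINK_FACTOR = 0.10  # dashed branches are drawn at 10% of their original length


def display_length(original_length: float, dashed: bool) -> float:
    """Length used for drawing; only dashed branches are rescaled."""
    return original_length * SHRINK_FACTOR if dashed else original_length


# Made-up branches: (label, original length, drawn dashed?)
branches = [("branch_1", 2.0, True), ("branch_2", 0.3, False)]
for label, length, dashed in branches:
    print(label, display_length(length, dashed))
```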

The Class II exon 3 multi-gene BEAST2 trees.

The trees were constructed using all Class IIA and all Class IIB exon 3 (non-PBR) sequences across all available species. The design of this figure follows Figure 3. A) The top tree shows the collapsed Class IIA gene tree, while the bottom tree shows the collapsed Class IIB gene tree. In this case, all collapsed clades are labeled with “Mam.” for mammals, because sequences from primates and mammal outgroups assort together by gene. B) The expanded MHC-DPA clade (corresponding to the clade in panel A marked by a <). C) The expanded MHC-DPB clade (marked by a † in panel A). D) The expanded MHC-DRB clade (marked by a § in panel A). OWM: old-world monkeys; NWM: new-world monkeys; Cat.: Catarrhini (apes and OWM); Mam.: mammals (primates and other outgroup mammals).

Class I α-block-focused multi-gene BEAST2 trees.

The α-block-focused trees use the common backbone sequences as well as additional sequences from our custom BLAST search of available reference genomes. For the purposes of visualization, some clades are collapsed and labeled with the species group and gene content of the clade. The white labels on colored rectangles indicate the species group of origin, while the colored text to the right of each rectangle indicates the gene or sequence name (see Appendix 2). The species abbreviations are defined in the species key at the bottom. A) Exon 3 α-block-focused BEAST2 tree with expanded MHC-V clade. B) The expanded MHC-A/AL/OKO/U/Y clade from the exon 3 tree (corresponding to the clade in panel A marked by a <), focusing on MHC-U. C) Exon 4 α-block-focused BEAST2 tree with expanded MHC-K/KL clade. D) The expanded MHC-W/WL/P/T/TL/OLI clade from the exon 4 tree (marked by a † in panel C). OWM: old-world monkeys; NWM: new-world monkeys; Cat.: Catarrhini (apes and OWM); Pri.: Primates (apes, OWM, and NWM).

Evolution of the Class I α-block.

The primate evolutionary tree is shown in gray (branches not to scale). The bottom of the tree shows currently known haplotypes in each species or species group. Horizontal gray bars indicate haplotypes shared among the African apes. The history of the genes/haplotypes in the α-block is overlaid on the tree, synthesizing previous work with our own observations (see Methods and Figure 8). Genes are represented by colored rectangles, while haplotypes are shown as horizontal lines containing genes. MHC-F, which marks the telomeric end of the α-block, was fixed early on and lies immediately to the left of every haplotype shown, but it is not pictured due to space constraints. Dashed arrows with descriptive labels represent evolutionary events. In the upper right, the “Symbol Key” explains the icons and labels. The “Gene Relationships” panel shows the relationships between the loci shown on the tree, without the layered complexity of haplotypes and speciation events. The “MHC-A Allelic Lineages” panel shows which MHC-A allele groups are present in human, chimpanzee, and gorilla.

Figure 6—figure supplement 1. A version of Figure 6 with references.

Evolution of MHC-DRB.

The bottom of the tree shows currently known haplotypes in each species or species group; human, chimpanzee, gorilla, and old-world monkey haplotypes are well characterized, while orangutan, gibbon, and new-world monkey haplotypes are only partially known. The history of the genes/haplotypes in the MHC-DRB region is overlaid on the tree, synthesizing previous work with our own observations (see Methods and Figure 8). The rest of the figure design follows that of Figure 6.

Figure 7—figure supplement 1. A version of Figure 7 with references.

BEAST2 trees provide insight into MHC gene and allele relationships.

We first created multi-gene Bayesian phylogenetic trees using sequences from all genes and species, separated into Class I, Class IIA, and Class IIB groups. We then focused on various subtrees of the multi-gene trees by adding more sequences for each subtree and running BEAST2 using only sequences from that group (in addition to the “backbone” sequences common to all trees). Our trees gave us insight into both overall gene relationships (this paper) and allele relationships within gene groups (see our companion paper, Fortier and Pritchard (2024)).
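As a rough illustration of how the input for a focused tree could be assembled, the sketch below merges the shared “backbone” sequences with the extra sequences for one gene group, keeping each sequence ID once. The function name and the sequence IDs are hypothetical; the legend does not describe the actual data handling or the BEAST2 configuration.

```python
def focused_sequence_set(backbone_ids, group_ids):
    """Backbone IDs followed by group-specific IDs, without duplicates."""
    seen = set()
    ordered = []
    for seq_id in list(backbone_ids) + list(group_ids):
        if seq_id not in seen:
            seen.add(seq_id)
            ordered.append(seq_id)
    return ordered


# Made-up IDs: two backbone sequences, one of which reappears in the group set.
backbone = ["backbone_seq_1", "backbone_seq_2"]
group = ["backbone_seq_2", "group_seq_1", "group_seq_2"]
print(focused_sequence_set(backbone, group))
```

Each focused alignment built this way would then be analyzed in its own BEAST2 run, so that every focused tree shares the same backbone sequences.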