Ancient trans-species polymorphism at the Major Histocompatibility Complex in primates

eLife Assessment

This important manuscript presents a thorough analysis of trans-specific polymorphism (TSP) in Major Histocompatibility Complex gene families across primates. The analysis makes the most of currently available genomic data and methods to substantially increase the amount and evolutionary time that TSPs can be observed. Both false negative TSPs due to missing genes at the assembly and/or annotation level, as well as false positives due to read mismapping with missing paralogs, are well assessed and discussed. Overall the evidence provided is compelling, and the manuscript clearly delineates the path for future progress on the topic.

https://doi.org/10.7554/eLife.103547.3.sa0

Significance of the findings:

Important: Findings that have theoretical or practical implications beyond a single subfield

Strength of evidence:

Compelling: Evidence that features methods, data and analyses more rigorous than the current state-of-the-art

During the peer-review process the editor and reviewers write an eLife Assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Classical genes within the Major Histocompatibility Complex (MHC) are responsible for peptide presentation to T cells, thus playing a central role in immune defense against pathogens. These genes are subject to strong selective pressures including both balancing and directional selection, resulting in exceptional genetic diversity, with thousands of alleles per gene in humans. Moreover, some allelic lineages appear to be shared between primate species, a phenomenon known as trans-species polymorphism (TSP) or incomplete lineage sorting, which is rare in the genome overall. However, despite the clinical and evolutionary importance of MHC diversity, we currently lack a full picture of primate MHC evolution. In particular, we do not know to what extent genes and allelic lineages are retained across speciation events. To start addressing this gap, we explore variation across genes and species in our companion paper (Fortier and Pritchard, 2025), and here we explore variation within individual genes. We used Bayesian phylogenetic methods to determine the extent of TSP at 17 MHC genes, including classical and non-classical Class I and Class II genes. We find strong support for ancient TSP in 7 of 10 classical genes, including, remarkably, between humans and old-world monkeys in MHC-DQB1. In addition to the long-term persistence of ancient lineages, we observe rapid evolution at nucleotides encoding the proteins’ peptide-binding domains. The most rapidly-evolving amino acid positions are extremely enriched for autoimmune and infectious disease associations. Together, these results suggest complex selective forces, arising from differential peptide binding, that drive short-term allelic turnover within lineages while also maintaining deeply divergent lineages for at least 31 million years in some cases.

Introduction

The Major Histocompatibility Complex (MHC) is a large locus containing many immune genes that is shared among the jawed vertebrates (Radwan et al., 2020). In humans, the MHC is also known as the HLA (Human Leukocyte Antigen) region; it spans about 5 megabases (Mb) on chromosome 6 and contains 412 genes (Figure 1A; Genome Reference Consortium, 2022; O’Leary et al., 2016). Many of these are part of the MHC gene family, a large group of evolutionarily related genes with varying functions. The ‘classical’ MHC genes are responsible for presenting protein fragments for inspection by T cells. MHC peptide presentation allows T cells to monitor the body for the presence of foreign peptides, which might indicate infection or cancer; this is crucial for vertebrate immune surveillance (Neefjes et al., 2011). ‘Non-classical’ MHC genes are essential to the innate immune system, where they perform a variety of niche roles. See the appendices of our companion paper (Fortier and Pritchard, 2025) for more detail.

Figure 1 (with 2 supplements)

The MHC region in humans (HLA).

(A) Each point at top represents the location of a gene. The different types of HLA genes are distinguished by different colors, shown in the key at left. The 19 functional HLA genes are labeled with their name (omitting their ‘HLA’ prefix due to space constraints). Gray points represent non-HLA genes and pseudogenes in the region. The black line shows nucleotide diversity (Nei and Li’s π) across the region, while the pink horizontal line shows the genome-wide average nucleotide diversity ($\pi \approx 0.001$) (Sachidanandam et al., 2001). (B) Nucleotide diversity around classical Class I gene HLA-A, with exon structure shown. (C) Nucleotide diversity around classical Class II gene HLA-DRB1, with exon structure shown. (D) Species tree showing the phylogenetic relationships among selected primates from this study (Kuderna et al., 2023). The colors of the icons are consistent with colors used throughout the paper to distinguish species. The pink vertical dashed lines indicate split times of the new-world monkeys (NWM) from the apes/old-world monkeys (OWM) (39 MYA), OWM from the apes (31 MYA), and the lesser apes (gibbons) from the great apes (23 MYA).

The MHC locus is extraordinarily polymorphic. Haplotypes can vary widely in gene content, and thousands of distinct alleles are observed at the classical genes in humans and other primates (Maccari et al., 2017; Maccari et al., 2020; Robinson et al., 2019). Different alleles are functionally diverse, with distinct peptide-binding affinities and, consequently, allelic differences in pathogen detection (Neefjes et al., 2011; Adams and Luoma, 2013). Given this huge diversity of functionally distinct alleles, the MHC is by far the most important locus in the genome for inter-individual variation in both infectious and autoimmune disease risk, with thousands of GWAS hits (Buniello et al., 2019; Smith et al., 2024). In our companion paper (Fortier and Pritchard, 2025), we built large multi-gene trees to explore the relationships between the different classical and non-classical genes. Here, we look within 17 specific genes—representing classical, non-classical, Class I, and Class II —to characterize trans-species polymorphism, a phenomenon characteristic of long-term balancing selection.

Historically, the MHC provided some of the first clear examples of positive selection in early studies of molecular evolution. By the 1980s and 1990s, researchers had noted an excess of missense variants (i.e. $d N / d S > 1$ ) in the peptide-binding regions of classical MHC genes (Hughes and Nei, 1988; Hughes and Nei, 1989), alleles shared across species (Arden and Klein, 1982; Mayer et al., 1988), and high nucleotide diversity across the region (Wakeland et al., 1987; Nei and Hughes, 1991) in rodents and primates. Indeed, modern data show that nucleotide diversity in the human MHC (HLA region) exceeds 70 times the genome-wide average near the classical genes, suggesting ancient balancing selection (Figure 1A–C). Meanwhile, the MHC also features prominently in genome-wide scans for short-term directional selection (Mathieson et al., 2015; Field et al., 2016; Allentoft et al., 2022; Cong et al., 2022; Okada et al., 2018; Yasumizu et al., 2020).

In the present paper, we explore a particularly striking feature of the selection signals at MHC, namely the evidence for extremely deep coalescence structure. Some alleles (haplotypes) are more closely related to corresponding alleles from another species than they are to distinct alleles from their own species. This phenomenon is referred to as trans-species polymorphism (TSP).

TSP is rare overall in humans. Across most of the genome, human alleles coalesce to a common ancestor well within the human lineage, typically around 2 million years (MY) ago (Mallick et al., 2016). Indeed, only ∼100 loci genome-wide show compelling evidence for sharing of ancestral alleles between humans and our closest relatives, chimpanzees (Leffler et al., 2013). TSP among humans and more distantly related species is even rarer; besides the MHC, the only other clear example of deep TSP is at the ABO locus (which influences blood type; Azevedo et al., 2015). At this locus, both the A and B alleles are shared by descent throughout the apes, implying that the A and B lineages date back to at least the divergence point of humans and gibbons 23 MY ago (Ségurel et al., 2012; Kuderna et al., 2023). Such deep coalescence is extraordinarily unlikely under a neutral model, and instead points to some form of balancing selection.

Meanwhile, TSP is evident at multiple MHC genes and in many different phylogenetic clades. TSP at this locus was first proposed in the 1980s on the basis of unusual sequence similarity between mice and rats (Klein, 1980; Arden and Klein, 1982; Klein, 1987; Figueroa et al., 1988; McConnell et al., 1988; Wakeland et al., 1987), and between humans and chimpanzees (Lawlor et al., 1988; Mayer et al., 1988). Later work has reported likely TSP between humans and apes (Slierendregt et al., 1995; Boyson et al., 1996; McKenzie et al., 1999; Wroblewski et al., 2017) and humans and old world monkeys (Mayer et al., 1992; Brändle et al., 1992; Kupfermann et al., 1992; Slierendregt et al., 1992; Geluk et al., 1993; Satta et al., 1996; Otting et al., 2000; Kriener et al., 2000; Gyllensten et al., 1990; Kriener et al., 2001; Otting et al., 1992; Otting and Bontrop, 1995; Otting et al., 1992; Otting et al., 2002); deep TSP is also consistent with the high levels of genetic diversity within the MHC. Such ancient TSP would make the MHC unique compared to any other locus in the genome. However, most previous work has not fully accounted for the inherent uncertainty in phylogenetic inference, especially given the potential for convergent evolution at functional sites. Although there is clear evidence for TSP, its exact age at each gene is still uncertain.

To address these questions, we used data from the IPD-MHC/HLA database—a large repository for MHC allele sequences from humans, non-human primates, and other vertebrates—along with supplementary sequences from NCBI RefSeq (Maccari et al., 2017; Maccari et al., 2020; Robinson et al., 2019). This represents the most complete sampling of primate MHC genes to date, spanning the entire primate tree (Figure 1D; Tables 2–4). We account for the uncertainty in phylogenetic inference using a Bayesian MCMC approach (BEAST2), which is well-suited to handle highly variable and rapidly-evolving sequences. In our companion paper (Fortier and Pritchard, 2025), we built trees to compare genes across dozens of species. When paired with previous literature, these trees helped us infer orthology and assign sequences to genes in some cases. That process helped inform this work, where we assess support for TSP within individual genes.

We find support for TSP among the African apes for genes MHC-C, -DPA1, and -DRB3, among the great apes for MHC-DPB1, and among all apes for MHC-B. We also find conclusive evidence for TSP at least back to the ancestor of humans and OWM in MHC-DQB1, implying—remarkably—that allelic lineages have been maintained by balancing selection for at least 31 MY. Rapidly-evolving sites are mainly located in the critical peptide-binding regions of the classical genes, but are spread throughout the coding region of the non-classical genes. Moreover, the most rapidly-evolving sites are also frequently associated with immune phenotypes and diseases in the literature, connecting our evolutionary findings with their functional consequences. These results highlight the contrasting roles of ancient balancing selection and short-term directional selection within the peptide-binding regions of the classical genes and motivate further evolutionary and functional studies to better understand this unique system.

Results

Data

We collected MHC nucleotide sequences for all genes from the IPD-MHC/HLA database, a large repository for MHC alleles from humans, non-human primates, and other vertebrates (Maccari et al., 2017; Maccari et al., 2020; Robinson et al., 2024). Although extensive, this database includes few or no sequences from important primates such as the gibbon, tarsier, and lemur. Thus, we supplemented our set of alleles using sequences from NCBI RefSeq (O’Leary et al., 2016). Because the MHC genes make up an evolutionarily related family, they can all be aligned (Kaufman, 2022; Adams and Luoma, 2013). In our companion paper (Fortier and Pritchard, 2025), we utilized these large multi-gene alignments for Class I, Class IIA, and Class IIB to compare genes. Here, we analyze subsets of those alignments, each focusing on a single gene or group of closely related genes.

We considered 16 gene groups spanning MHC classes and functions. These include the classical Class I genes (MHC-A-related, MHC-B-related, MHC-C-related), non-classical Class I genes (MHC-E-related, MHC-F-related, MHC-G-related), classical Class IIA genes (MHC-DRA-related, MHC-DQA-related, MHC-DPA-related), classical Class IIB genes (MHC-DRB-related, MHC-DQB-related, MHC-DPB-related), non-classical Class IIA genes (MHC-DMA-related, MHC-DOA-related), and non-classical Class IIB genes (MHC-DMB-related, MHC-DOB-related). See Tables 2–5 for a breakdown of the sequences from each species included in each group. We studied two or three different genic regions for each group: exon 2 alone, exon 3 alone, and (for Class I) exon 4 alone. Exons 2 and 3 encode the peptide-binding region (PBR) for the Class I proteins, and exon 2 alone encodes the PBR for the Class II proteins. For the Class I genes, we also considered exon 4 alone because it is comparable in size to exons 2 and 3 and provides a good contrast to the PBR-encoding exons. Because few intron sequences were available for non-human species, we did not include them in our analyses.

Trans-species polymorphism is widespread

For each gene group and genic region, we used the Bayesian phylogenetics software BEAST2 (Bouckaert et al., 2014; Bouckaert et al., 2019) with package SubstBMA (Wu et al., 2013) to infer phylogenies. One major advantage of BEAST2 over less tunable methods is that it allows evolutionary rates to vary across sites, which is important for genes such as these, which evolve rapidly in functional regions (Wu et al., 2013). We also considered each exon separately, both to minimize the impact of recombination and to contrast the binding-site-encoding exons with the non-binding-site-encoding exons.

We can visualize each set of phylogenies as a single summary tree, which maximizes the product of posterior clade probabilities (BEA, 2024). Three of these summary trees are shown in Figure 2, constructed from the second exons of classical Class I gene MHC-C, classical Class II gene MHC-DQB, and non-classical Class II gene MHC-DOA, respectively (see Figure 2—figure supplements 1–16, Figure 3—figure supplements 1–12, Figure 4—figure supplements 1–10 for the other exons and genes).
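To make the summary-tree criterion concrete, the following Python sketch scores each sampled tree by the product of the posterior probabilities of its clades and returns the best-scoring one. It is an illustration only, not the BEAST2 implementation: the function name, the representation of a tree as a set of clades (each clade a frozenset of tip names), and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def max_clade_credibility(trees):
    """Return (index, log_score) of the sampled tree whose clades have the
    highest product of posterior clade probabilities.

    `trees` is a list of trees; each tree is a set of clades, and each clade
    is a frozenset of tip names (a hypothetical, simplified representation).
    """
    n = len(trees)
    # Posterior probability of a clade = fraction of sampled trees containing it.
    counts = Counter(clade for tree in trees for clade in set(tree))
    best_index, best_score = None, -math.inf
    for i, tree in enumerate(trees):
        # Sum of logs is the log of the product; it avoids numerical underflow.
        score = sum(math.log(counts[clade] / n) for clade in set(tree))
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Toy posterior sample of three trees on tips a, b, c, d (made-up data).
ab, cd, ac, bd = (frozenset(s) for s in ("ab", "cd", "ac", "bd"))
sample = [{ab, cd}, {ab, cd}, {ac, bd}]
best, score = max_clade_credibility(sample)
```

In this toy sample the clades {a, b} and {c, d} each appear in two of three trees, so the first tree is selected.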

Figure 2 (with 16 supplements)

BEAST2 allele summary trees using sequences from exon 2.

(A) MHC-C, (B) MHC-DQB1, and (C) MHC-DOA. Each tip represents an allele, with color and four-letter abbreviation representing the species (see Figure 1—figure supplement 2 for full species key). The species label is followed by the allele name (see Appendix 1 for more details on nomenclature) or RefSeq accession number. For simplicity, monophyletic groups of similar alleles are collapsed with a triangle and labeled with their one-field allele name. The color/abbreviation key (center) also depicts the species tree (Kuderna et al., 2023). Human alleles (HLA; red) are bolded for emphasis. Dashed outgroup branches are scaled by a factor of $\frac{1}{10}$ to clarify tree structure within the clade of interest. The smaller inset tree in panel B highlights the relationships between two human allele groups (red) and two OWM allele groups (green). The indicated human and OWM lineages coalesce more recently between groups than within each group. Pri., primate backbone sequences; Mam., mammal outgroup sequences.

MHC-C is a classical Class I gene that duplicated from MHC-B in the ancestor of the great apes (Piontkivska, 2003; Fukami-Kobayashi et al., 2005; Abi-Rached et al., 2010; Adams and Parham, 2001; Lugo and Cadavid, 2015). Its protein product participates in classical antigen presentation and also serves as the dominant Class I molecule for interacting with killer cell immunoglobulin-like receptors (KIRs) in innate immunity (Adams and Parham, 2001; Guethlein et al., 2015; Vollmers et al., 2021). MHC-DQB is a classical Class II gene which pairs with MHC-DQA. Apes have two MHC-DQ copies, MHC-DQA1/MHC-DQB1 and MHC-DQA2/MHC-DQB2, while the second copy was deleted in OWM. NWM can have two or three sets of MHC-DQ genes, depending on species, but it has been unclear whether any of them are 1:1 orthologous with the ape or OWM genes. Lastly, MHC-DOA is a non-classical Class II gene whose protein product modulates MHC-DM activity, indirectly affecting Class II peptide presentation (Heijmans et al., 2020; Neefjes et al., 2011). The genes’ differing roles result in different patterns in the phylogenetic trees.

Critically, we observe that, at classical genes MHC-C (Figure 2A) and MHC-DQB (Figure 2B), the alleles fail to cluster together according to species, as indicated by the mixed-color clades throughout the trees. In MHC-C, human (HLA; red rectangles), chimpanzee (Patr; dark pink), bonobo (Papa; light pink), and even gorilla (Gogo; orange) alleles can be found throughout the tree, indicating that variation in this gene is almost as old as the gene itself.

Figure 2B displays the BEAST2 tree consisting of MHC-DQB1, -DQB2, and outgroup -DQB alleles all together. It shows many mixed-color clades throughout, consisting of ape (red/orange/yellow rectangles) and OWM (green) alleles grouping together. Alleles often group by first-field name instead of by species, indicating that some allelic lineages have been maintained since before the split of humans and OWM—at least 31 MY. An example of this is shown in the inset to the left of this tree, ‘Example of Human-OWM TSP’. Here, human alleles coalesce with OWM alleles before they coalesce with each other. Near the bottom of the tree is a clade consisting of ape and NWM MHC-DQB2 sequences, suggesting that they are orthologous. However, NWM species have expanded their MHC-DQ regions, so these genes may not actually be 1:1 orthologous (see our companion paper, Fortier and Pritchard, 2025). Additionally, Strepsirrhini sequences do not group with either the MHC-DQB1 or -DQB2 clade, showing that the duplications of the MHC-DQB genes must have happened in or after the Simiiformes ancestor.

Figure 2C shows the BEAST2 summary tree for non-classical MHC-DOA. In this tree, alleles group exclusively by species (clades are collapsed for clarity) and the branching order of the species deviates only slightly from the species tree. This shows that not all MHC genes are affected by long-term balancing selection, despite the complicated linkage disequilibrium across classical and non-classical genes in the region (Smith et al., 2024). This also suggests that antigen presentation specifically, as opposed to a general role in immune function, is the driving force behind this long-term balancing selection.

While the BEAST2 summary trees in Figure 2 are suggestive of deep TSP, they do not directly quantify the statistical confidence in the TSP model. Moreover, standard approaches to quantifying uncertainty in trees, such as bootstrap support or posterior probabilities for specific clades, do not relate directly to hypothesis testing for TSP. We therefore implemented an alternative approach using BEAST2 output, as follows (see the Materials and methods; Bayes factors for details).

We performed formal model testing for TSP within quartets of alleles, where two alleles are taken from a species (or taxon) A, and two alleles are taken from a different species (or taxon) B. If the alleles from A group together (and the alleles from B group together) in the unrooted tree, this quartet supports monophyly of A (and of B). In a neutral genealogy, monophyly of each species’ sequences is expected. But if alleles from A group more closely with alleles from B in the unrooted tree, then this comparison supports TSP. Since BEAST2 samples from the posterior distribution of trees, we counted the number of trees that support TSP versus the number that support monophyly as an estimate of the posterior support for each model. We then summarized the relative support for each model by converting these to Bayes factors (see Materials and methods; Bayes factors for more detail). The precise interpretation of Bayes factors depends on one’s prior expectation; however, following standard guidelines (Jeffreys, 1998), we suggest that Bayes factors >100 should be considered as strong support in favor of TSP. Bayes factors <1 are evidence against TSP. For each comparison of two taxa, we report the maximum Bayes factor across the possible quartets, as we are interested in whether any quartet shows compelling evidence for TSP.
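The counting step described above can be sketched in Python as follows. This is a hedged illustration rather than the procedure used in this study (see Materials and methods; Bayes factors for that): the function names, the pseudocount that guards against division by zero, and the prior odds are assumptions. The prior here assumes the three unrooted quartet topologies are equally likely, of which one is monophyly and two pair an allele from A with an allele from B, giving prior odds of 2:1 for TSP.

```python
def quartet_bayes_factor(n_tsp, n_mono, prior_tsp=2 / 3, prior_mono=1 / 3,
                         pseudocount=0.5):
    """Bayes factor for TSP versus monophyly for one quartet of alleles.

    n_tsp / n_mono: number of posterior trees in which the quartet supports
    TSP / monophyly. The priors and the pseudocount are illustrative
    assumptions, not values taken from the paper.
    """
    posterior_odds = (n_tsp + pseudocount) / (n_mono + pseudocount)
    prior_odds = prior_tsp / prior_mono
    return posterior_odds / prior_odds


def max_bayes_factor(quartet_counts):
    """Maximum Bayes factor over all tested quartets for one taxon pair.

    quartet_counts: iterable of (n_tsp, n_mono) pairs, one per quartet.
    """
    return max(quartet_bayes_factor(t, m) for t, m in quartet_counts)


# Hypothetical counts for two quartets drawn from 1000 posterior trees.
counts = [(10, 990), (900, 100)]
reported = max_bayes_factor(counts)
```

Reporting the maximum asks whether any single quartet is compelling, which matches the stated goal but also means the value grows with the number of quartets tested.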

Gene conversion, the unidirectional transfer of short tracts of DNA from a donor to an acceptor sequence, can affect the inferred trees. In particular, acceptor sequences may group more strongly with donor sequences than with sequences that share DNA by descent. This can make it difficult to distinguish trees influenced by trans-species polymorphism from those influenced by gene conversion. Thus, we inferred gene conversion tracts using GENECONV (Sawyer, 1999) and excluded significant gene-converted acceptor alleles from the Bayes factor calculations. While GENECONV cannot possibly infer all past events, this procedure should ameliorate any biasing effects. Additionally, we consider each exon separately; analyzing short tracts reduces the effect of recombination on the tree (see our companion paper for more specifics; Fortier and Pritchard, 2025). Note that the number of sequences available for comparison also affects the detectability of TSP. For example, if the only sequences available are from the same allelic lineage, they will coalesce more recently in the past than they would with alleles from a different lineage and would not show evidence for TSP. This means our method is well-suited to detect TSP when a diverse set of allele sequences is available, but it is conservative when there are few alleles to test. There were few available alleles for some non-classical genes, such as MHC-F, and some species, such as gibbon. This uneven sampling of taxa means that some TSPs cannot be detected at this time.

Bayes factors are shown in Figures 3 and 4. See Figure 2—figure supplements 1–16, Figure 3—figure supplements 1–12, and Figure 4—figure supplements 1–10 for examples of high-Bayes-factor quartets for each comparison.

Figure 3 (with 14 supplements)

Strong support for TSP at Class I genes MHC-B and -C.

Bayes factors computed over the set of *BEAST* trees indicate deep TSP. Different species comparisons are listed on the y-axis, and different gene regions are listed on the x-axis. Each table entry is colored and labeled with the maximum Bayes factor among all tested quartets of alleles belonging to that category. High Bayes factors (orange) indicate support for TSP among the given species for that gene region, while low Bayes factors (teal) indicate that alleles assort according to the species tree, as expected. Bayes factors above 100 are considered decisive. Tan values show poor support for either hypothesis, while white boxes indicate that there are not enough alleles in that category with which to calculate Bayes factors. MHC-A is not present in the NWMs, and MHC-C was not present before the human-orangutan ancestor, so it is not possible to calculate Bayes factors for these species comparisons.

Figure 4 (with 12 supplements)

Strong support for TSP at the classical Class II genes.

Bayes factors computed over the set of *BEAST* trees indicate deep TSP. Different species comparisons are listed on the y-axis, and different gene regions are listed on the x-axis. Each table entry is colored and labeled with the maximum Bayes factor among all tested quartets of alleles belonging to that category. High Bayes factors (orange) indicate support for TSP among the given species for that gene region, while low Bayes factors (teal) indicate that alleles assort according to the species tree, as expected. Bayes factors above 100 are considered decisive. Tan values show poor support for either hypothesis, while white boxes indicate that there are not enough alleles in that category with which to calculate Bayes factors.

At the Class I genes (Figure 3), MHC-C shows strong support for TSP within the African apes: human, chimpanzee, and gorilla. Having arisen fairly recently in the ancestor of human and orangutan, MHC-C has thus maintained some allelic lineages for most of its history. TSP has not previously been reported for this gene.

For MHC-A, Bayes factors vary considerably depending on exon and species pair. Past work suggests that this gene has had a long history of gene conversion affecting different exons, resulting in different evolutionary histories for different parts of the gene (Hans et al., 2017; Gleimer et al., 2011; Adams and Parham, 2001). Indeed, we excluded many MHC-A sequences from our Bayes factor calculations because they were identified as gene-converted in our GENECONV analysis or were previously suggested to be recombinants. As shown in Figure 3, the lack of concordance in Bayes factors across the different exons for MHC-A is evidence for gene conversion, rather than balancing selection, being the most important factor in this gene’s evolution. In contrast, the other gene groups generally show concordance in Bayes factors across exons. We interpret this as evidence in favor of TSP being the primary driver of the observed deep coalescence structure for MHC-B and -C (rather than recombination or gene conversion).

The non-classical Class I genes MHC-E, -F, and -G (bottom row of Figure 3) are interspersed with the classical Class I genes in the MHC region (see Figure 1), but their products have niche functions in innate immunity. Their indirect involvement in adaptive immunity means they experience different selective pressures. They exhibit lower polymorphism and $d N / d S < 1$, consistent with their not having been subject to the same pathogen-mediated balancing selection. The Bayes factors for all three of these genes show strong evidence against TSP, as expected. However, since there are fewer alleles available for the non-classical genes, we note that our method may be conservative here. Interestingly, despite its non-classical role, MHC-E has a known balanced polymorphism in humans; the two main alleles are at similar frequencies worldwide but may have different expression levels and peptide preferences (Paganini et al., 2019; Grant et al., 2020). Our approach, which is meant to detect ancient TSP, does not reveal balancing selection in MHC-E, suggesting that this balanced polymorphism is young. For MHC-G, there were not enough sequences available to perform many of the tests (at least two from each species group are required). While we do not expect to see evidence of TSP in this gene, sequencing more alleles is necessary to address this.

Each Class II MHC molecule has an α and β component which are encoded by an A and B gene, respectively. Bayes factors for the Class IIA genes are shown in the top row of Figure 4, while those for their Class IIB partners are shown in the bottom row. The non-classical MHC-DM and -DO molecules assist the classical Class II genes with peptide loading and are not believed to be shaped by balancing selection (Heijmans et al., 2020; Neefjes et al., 2011). As expected, we see strong evidence against TSP between humans and all other primate species for these genes (first two columns of Figure 4).

In contrast, we find evidence for deep TSP within the classical Class II genes. MHC-DPA1 shows TSP between human, chimpanzee, and gorilla in exon 2, but not in exon 3. We find that this TSP in MHC-DPA1 is less deep than has been previously suggested with non-Bayesian methods (Otting and Bontrop, 1995), underscoring the importance of this methodology for handling the MHC. Meanwhile, its partner MHC-DPB1 shows strong evidence for TSP between human and orangutan in exon 3 and suggestive evidence in exon 2; our work provides the first evidence of TSP between humans and other apes for this gene (Slierendregt et al., 1995).

The MHC-DR genes behave somewhat differently than the other classical Class II molecules. While the α and β components of all the other molecules engage in exclusive binding, there are many different MHC-DRβ molecules which all bind to the same MHC-DRα. The MHC-DRA gene is conserved across species with little polymorphism, while the MHC-DRB region is highly variable both in gene content and allelic diversity. Consistent with this, Bayes factors for the MHC-DRA gene reveal strong evidence against TSP for all species pairs, while MHC-DRB1 shows strong evidence in favor of TSP between human, chimpanzee, gorilla, OWM, and even NWM in exon 2. However, because the Bayes factors only support TSP between humans and OWM/NWM in MHC-DRB1 in exon 2, but not in exon 3, this could mean alleles are not actually that ancient. We show in our companion paper that individual MHC-DRB genes are short-lived, and only three are truly orthologous between apes and OWM (Fortier and Pritchard, 2025). These pieces of evidence suggest previous work may have overstated the extent of TSP at this locus.

Remarkably, the MHC-DQB1 gene shows definitive evidence for TSP back to at least the ancestor of humans and OWM. While this result has been presented previously, we confirm it with decisive evidence (Bayes factor >100) for allelic lineages being maintained for over 31 MY (Otting et al., 1992; Otting et al., 2002; Loisel et al., 2006; Simons et al., 2017).

Many of our BEAST trees also showed intermingling of sequences from different species within the OWM or NWM (e.g. Figure 2—figure supplement 4), even if they did not form trans-species clades with ape sequences. This could indicate trans-species polymorphism within the OWM or NWM that is still ancient, but not old enough to be shared with the apes. Therefore, we also calculated Bayes factors across different clades of OWM and NWM (Figure 3—figure supplement 13, Figure 3—figure supplement 14, Figure 4—figure supplement 11, and Figure 4—figure supplement 12). We see very strong evidence for TSP between all groups of OWM for MHC-DPA1, -DPB1, -DQA1, and -DQB1, indicating that allelic lineages at these genes have been maintained within the OWM for at least 19 MY. Unexpectedly, we also see evidence for TSP in some non-classical genes. MHC-E, a gene that is non-classical in humans and is presumed non-classical in OWM (yet is duplicated in some species), shows evidence for 15-MY-old TSP within the OWM. In non-classical MHC-DMB, we also observe TSP within the OWM as old as 11 MY. This could indicate differing roles for these genes in the OWM lineage, and functional experiments are needed to explore this. Due to the uncertainty of locus assignments for alleles of the OWM MHC-A, -B, and -DRB genes and of the NWM genes, we cannot make definitive conclusions about TSP within these clades for these other genes.

In summary, the phylogenetic analyses point to ancient TSP in classical genes MHC-B, -C, -DPA1, -DPB1, -DQB1, -DRB1, and -DRB3. Bayes factors for the non-classical genes MHC-E, -F, -G, -DMA, -DMB, -DOA, and -DOB do not indicate TSP involving apes at these loci—as expected for non-classical genes. However, we detected possible TSP at some of these genes within other clades, such as the OWM, hinting at possible functional differences. Overall, TSP is more ancient among the Class II genes than the Class I genes, consistent with the genes’ older age.

From evolution to function

Alongside the evidence for ancient TSP, the MHC region is also notable for its high rate of missense substitutions ($dN/dS > 1$) (Hughes and Nei, 1988; Hughes and Nei, 1989) and its large number of GWAS hits for autoimmune and infectious diseases (Buniello et al., 2019; Kennedy et al., 2017). We next aimed to understand how these observations relate to signals of TSP and known features of the MHC proteins.

To explore these questions, we first estimated the per-site evolutionary rates within each gene. As in our TSP analysis, we used the BEAST2 package SubstBMA, which estimates evolutionary rates at every site concurrently with a tree. We averaged these rates over all states in the chain to get per-site evolutionary rates, then calculated their fold change relative to the average rate among mostly-gap sites in the alignment (‘baseline’; see Materials and methods; Rapidly-evolving sites).
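
The fold-change calculation described above can be sketched as follows. This is a minimal illustration rather than the pipeline used for the figures: the function names, the toy alignment, and the 50% gap cutoff used to define 'mostly-gap' sites are assumptions made here (the actual definition is given in Materials and methods).

```python
import math


def per_site_mean_rates(chain_states):
    """Average each site's rate over all sampled states of the chain."""
    n_states = len(chain_states)
    n_sites = len(chain_states[0])
    return [sum(state[i] for state in chain_states) / n_states
            for i in range(n_sites)]


def gap_fraction(alignment, site):
    """Fraction of sequences carrying a gap character at one site."""
    return sum(seq[site] == "-" for seq in alignment) / len(alignment)


def log2_fold_changes(rates, alignment, gap_threshold=0.5):
    """log2 of each site's rate over the mean rate at mostly-gap sites.

    gap_threshold is a hypothetical cutoff chosen for this sketch.
    Rates are assumed to be strictly positive.
    """
    baseline_sites = [i for i in range(len(rates))
                      if gap_fraction(alignment, i) > gap_threshold]
    if not baseline_sites:
        raise ValueError("no mostly-gap sites available for the baseline")
    baseline = sum(rates[i] for i in baseline_sites) / len(baseline_sites)
    return [math.log2(r / baseline) for r in rates]


# Toy data: three sequences, four sites; only site 2 is mostly gaps.
toy_alignment = ["AC-G", "AC-G", "AT--"]
toy_chain = [[1.0, 4.0, 0.5, 2.0], [1.0, 4.0, 0.5, 2.0]]
toy_rates = per_site_mean_rates(toy_chain)
toy_fold = log2_fold_changes(toy_rates, toy_alignment)
print(toy_fold)
```

On this scale, a value of 1 corresponds to twice the baseline rate and a value of 2 to four times the baseline rate.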

Figure 5A shows the substitution rate fold change for each nucleotide along the concatenated coding sequence of Class I genes MHC-B, -C, and -E (see Figure 5—figure supplement 1 for the other genes). In classical genes MHC-B and -C, nearly all the rapidly-evolving sites lie within exons 2 and 3, which encode the protein’s peptide-binding domain. While exons 2 and 3 make up only ∼50% of the gene’s length, they contain 94% and 90% of the sites evolving at more than four times the baseline evolutionary rate for the classical genes MHC-B and MHC-C, respectively. In MHC-B, exons 2 and 3 each show significantly higher proportions of rapidly-evolving sites compared to the ‘other’ exons (exons in the gene excluding 2, 3, or 4), while the difference is not significant for MHC-C (Figure 5—figure supplement 2). This result could reflect the relatively young age of MHC-C or its additional role as the dominant Class I molecule for interacting with KIRs (Adams and Parham, 2001; Guethlein et al., 2015; Vollmers et al., 2021; Piontkivska, 2003).
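
The comparison between an exon's share of gene length and its share of rapidly-evolving sites can be illustrated with a short sketch. The function names and the toy values below are hypothetical. The cutoff of 2 on the log2 scale corresponds to four times the baseline rate.

```python
def share_of_rapid_sites(log2_fold, exon_of_site, target_exons,
                         min_log2_fold=2.0):
    """Fraction of rapidly-evolving sites that fall in the target exons.

    Returns None when no site exceeds the cutoff.
    """
    rapid = [i for i, fc in enumerate(log2_fold) if fc > min_log2_fold]
    if not rapid:
        return None
    in_target = sum(exon_of_site[i] in target_exons for i in rapid)
    return in_target / len(rapid)


def share_of_length(exon_of_site, target_exons):
    """Fraction of all coding sites that belong to the target exons."""
    return sum(e in target_exons for e in exon_of_site) / len(exon_of_site)


# Toy example: six sites spread over exons 1 to 4 (values are made up).
toy_log2_fold = [0.1, 2.5, 3.0, 0.2, 2.1, -0.5]
toy_exons = [1, 2, 2, 3, 4, 4]
rapid_share = share_of_rapid_sites(toy_log2_fold, toy_exons, {2, 3})
length_share = share_of_length(toy_exons, {2, 3})
print(rapid_share, length_share)
```

A rapid-site share well above the length share, as in this toy case, is the pattern reported above for exons 2 and 3 of MHC-B and -C. The sketch does not include a significance test.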

Figure 5

Rapidly-evolving sites in the Class I genes.

(A) Rapidly-evolving sites are primarily located in exons 2 and 3. Here, the exons are concatenated such that the cumulative position along the coding region is on the x-axis. The dashed orange lines denote exon boundaries. The three genes are aligned such that the same vertical position indicates an evolutionarily equivalent site. The y-axis shows the substitution rate at each site, expressed as a fold change (the base-2 logarithm of each site’s evolutionary rate divided by the mean rate among mostly-gap sites in each alignment; see Materials and methods). (B) Rapidly-evolving sites are located in each protein’s peptide-binding pocket. Structures are Protein Data Bank (Berman et al., 2000) 4BCE (Teze et al., 2014) for HLA-B, 4NT6 (Choo et al., 2014) for HLA-C, and 7P4B (Walters et al., 2022) for HLA-E, with images created in PyMOL (Schrödinger, LLC, 2021). Substitution rates for each amino acid are computed as the mean substitution rate of the three sites composing the codon. Orange indicates rapidly-evolving amino acids, while teal indicates conserved amino acids. (C) Rapidly-evolving amino acids are significantly closer to the peptide than conserved amino acids. The y-axis shows the BEAST2 substitution rate and the x-axis shows the minimum distance to the bound peptide, measured in PyMOL (Schrödinger, LLC, 2021). Each point is an amino acid, and distances are averaged over several structures (see Table 5). The orange line is a linear regression of substitution rate on minimum distance, with slope and p-value annotated on each panel.

In contrast to these classical genes, non-classical MHC-E primarily presents self-peptides for recognition by NK cell receptors, and its peptide-binding groove is tailored to accommodate a very specific set of self-peptides: leader peptides cleaved from other Class I MHC proteins during processing (Miller et al., 2003). As shown in Figure 5A, this gene has fewer rapidly-evolving sites than the classical genes. These sites are also relatively evenly distributed across the gene, with exons 2 and 3 (which cover ∼50% of the gene’s length) containing 45% of the sites evolving at over four times the baseline evolutionary rate. Interestingly, exon 4, a non-peptide-binding exon of equal size, displays a significantly lower proportion of rapidly-evolving sites compared with the ‘other’ exons (Figure 5—figure supplement 2). These results support the view that MHC-E has been remarkably conserved across the primates and that its evolution may not be driven by differential peptide binding (Heijmans et al., 2020).

We then examined where the rapidly-evolving sites lie within the physical protein structures. To do this, we averaged the per-site rates within each codon to get per-amino-acid rates, then mapped these onto the known human protein structures. Unfortunately, there are few non-human primate protein structures in the Protein Data Bank (Berman et al., 2000), but the macaque structures we found were nearly identical to those of human. Figure 5B shows structures for human HLA-B, -C, and -E; this view features the peptide (black) sitting in the peptide-binding groove (flanked on top and bottom by helices) (see Figure 5—figure supplement 3 for the rest of the Class I proteins). In MHC-B and -C, rapidly-evolving amino acids (orange) tend to be located within the peptide-binding groove. To quantify this, we measured the minimum distance between each amino acid and the bound peptide. We averaged these distances over several structures, which are listed in the Materials and methods (Table 5). For all three proteins, amino acids closer to the peptide have significantly higher evolutionary rates than amino acids further from the peptide, as shown in Figure 5C (see also Figure 5—figure supplements 4 and 5). The effect is much less pronounced in non-classical MHC-E, where even the amino acids closest to the peptide do not exhibit high evolutionary rates. These results are consistent with the expectation that rapid evolution and diversity at the classical MHC genes would be mediated by selective pressures for changes in peptide binding.
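
The two per-residue quantities used here, the codon-averaged rate and the minimum distance to the peptide, can be sketched in plain Python. This is an illustration only and is not the PyMOL procedure used for the figures: the function names are hypothetical, and atoms are assumed to be supplied as (x, y, z) coordinates in Å.

```python
import math


def per_codon_rates(site_rates):
    """Mean rate of the three nucleotide sites making up each codon."""
    if len(site_rates) % 3 != 0:
        raise ValueError("number of sites must be a multiple of three")
    return [sum(site_rates[i:i + 3]) / 3
            for i in range(0, len(site_rates), 3)]


def min_distance(residue_atoms, peptide_atoms):
    """Smallest atom-to-atom distance between a residue and the peptide."""
    return min(math.dist(a, b)
               for a in residue_atoms for b in peptide_atoms)


def mean_over_structures(distances):
    """Average one residue's minimum distance across several structures."""
    return sum(distances) / len(distances)


# Toy coordinates (made up): one residue with two atoms, a two-atom peptide.
residue = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
peptide = [(4.0, 0.0, 0.0), (1.0, 3.0, 0.0)]
print(per_codon_rates([1.0, 2.0, 3.0, 4.0, 4.0, 4.0]))
print(min_distance(residue, peptide))
```

In the toy case the closest atom pair is 3 Å apart, so that value would be the residue's distance for this structure before averaging across structures.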

The rapidly-evolving sites for the Class II genes are shown in Figure 6. Panel A shows the substitution rate fold change for each nucleotide along the concatenated coding sequence of the Class II MHC-DRA, -DQA, -DRB, and -DQB gene groups (see Figure 6—figure supplements 1 and 3 for the other genes). Both MHC-DQA and -DQB are extraordinarily polymorphic, but MHC-DRA is conserved compared to its multiple, highly-variable MHC-DRB partners. Indeed, Figure 6A shows that rapidly-evolving sites are concentrated in binding-site-encoding exon 2 for MHC-DRB, -DQB, and to a lesser extent MHC-DQA. Exon 2, which makes up ∼30% of the coding region, contains 32% of sites evolving at more than twice the baseline rate in MHC-DRA, but 57% of such sites in MHC-DQA, 61% in MHC-DQB, and 73% in MHC-DRB. Comparing across exons, exon 2 contains a significantly higher proportion of rapidly-evolving sites compared to the ‘other’ exons in classical MHC-DQA, -DQB, -DRB, and -DPB, but also, curiously, in non-classical MHC-DMA and -DMB (Figure 6—figure supplements 2 and 4). It is interesting that MHC-DM appears to be evolving rapidly in its binding-site-encoding exons, despite the fact that it is not thought to bind peptides. Instead, it is responsible for assisting with peptide loading onto the classical Class II molecules. Co-evolution with MHC-DR in particular seems possible; the interaction between the MHC-DM and -DR molecules depends on the affinity of the peptide trying to bind with MHC-DR. MHC-DM thus shapes the repertoire of peptides presented by MHC-DR, favoring high-affinity peptides (Dijkstra and Yamaguchi, 2019; Schulze and Wucherpfennig, 2012). It is plausible that the host-pathogen evolution shaping the MHC-DRB genes has resulted in co-evolution of MHC-DMA and -DMB to maintain this regulatory interaction.

Figure 6

Rapidly-evolving sites in the Class II genes.

(A) Rapidly-evolving sites are primarily located in exon 2. Here, the exons are concatenated such that the cumulative position along the coding region is on the x-axis. The dashed orange lines denote exon boundaries. The α genes (top two plots) are aligned such that the same vertical position indicates an evolutionarily equivalent site; the same is true for the β genes (bottom two plots). The y-axis shows the substitution rate at each site, expressed as a fold change (the base-2 logarithm of each site’s evolutionary rate divided by the mean rate among mostly-gap sites in each alignment; see Materials and methods). (B) Rapidly-evolving sites are located in each protein’s peptide-binding pocket. Structures are Protein Data Bank (Berman et al., 2000) 5JLZ (Gerstner et al., 2016) for HLA-DR and 2NNA (Henderson et al., 2007) for HLA-DQ, with images created in PyMOL (Schrödinger, LLC, 2021). Substitution rates for each amino acid are computed as the mean substitution rate of the three sites composing the codon. Orange indicates rapidly-evolving amino acids, while teal indicates conserved amino acids. (C) Rapidly-evolving amino acids are significantly closer to the peptide than conserved amino acids. The y-axis shows the BEAST2 substitution rate and the x-axis shows the minimum distance to the bound peptide, measured in PyMOL (Schrödinger, LLC, 2021). Each point is an amino acid, and distances are averaged over several structures (see Table 5). The orange line is a linear regression of substitution rate on minimum distance, with slope and p-value annotated on each panel.

We again mapped the evolutionary rates onto human protein structures, shown in Figure 6B. In each molecule, the α chain is positioned at the top and encompasses the upper helix forming the binding site, while the β chain is oriented toward the bottom and encompasses the lower helix. The peptide is shown in black. Rapidly-evolving sites are concentrated in each protein’s binding site, although in MHC-DR this is more prominent in the bottom helix (MHC-DRB) (see Figure 6—figure supplement 5 for the other proteins).

We then measured the distance between each amino acid and the bound peptide, shown in Figure 6C. MHC-DRA did not show a significant relationship between evolutionary rate and distance, as expected given its relatively uniform distribution of evolutionary rates across the sequence (Figure 6A). For the other three proteins, amino acids closer to the peptide had significantly higher evolutionary rates. This held true for the classical MHC-DPA and -DPB genes as well (Figure 5—figure supplement 4 and Figure 6—figure supplement 6). Again, this is consistent with differential peptide binding and TCR responsiveness driving the diversity and long-term balancing selection at the classical genes. We could not measure peptide distances for non-classical MHC-DOA, -DOB, -DMA, and -DMB genes because they do not engage in peptide presentation.
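
The relationship between rate and distance is summarized by a linear regression of substitution rate on minimum distance. A minimal ordinary least squares sketch is given below. The function name and the toy data are hypothetical, and the sketch returns only the slope and intercept, not the p-value annotated on the figure panels.

```python
def ols_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (made up): rate falls as distance to the peptide grows.
toy_distance = [3.0, 4.0, 5.0, 6.0]   # minimum distance to peptide, in Å
toy_rate = [3.0, 2.5, 2.0, 1.5]       # log2 fold change in rate
slope, intercept = ols_fit(toy_distance, toy_rate)
print(slope, intercept)
```

A negative slope, as in this toy case, corresponds to amino acids nearer the peptide evolving faster.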

Lastly, since the rapidly-evolving sites are likely involved in peptide binding, they also influence the response to pathogens and self-antigens, presumably affecting risk for infectious and autoimmune diseases. To bridge the gap between evolution and complex traits, we collected HLA fine-mapping studies for infectious, autoimmune, and other diseases, as well as for biomarkers and TCR phenotypes. These studies report associations between a disease or trait and classical HLA alleles, SNPs, and amino acid variants, often with multiple independent hits per gene. They demonstrate that HLA variation affects disease in complex ways—sometimes, a single variant is strongly associated with a condition, while other times, a combination of amino acids or even an entire allele (haplotype) is the strongest indicator of disease susceptibility or protection.

We were interested in whether our rapidly-evolving amino acids from the BEAST2 analysis corresponded with disease-associated amino acids from the literature. Table 1 lists disease, trait, and TCR-phenotype associations for the most rapidly-evolving amino acids (fold change ≥ 1) of the MHC-B group (see Table 1—source data 1 for the other genes). The majority of rapidly-evolving positions in MHC-B have at least one association. Furthermore, all three classical Class I genes (MHC-A, -B, and -C) show a significant positive relationship between per-amino-acid evolutionary rate and the number of amino acid associations (Figure 6—figure supplement 7). Interestingly, this relationship is not significant for the Class II genes, possibly because they evolve more slowly overall.

Table 1

Rapidly-evolving amino acids in MHC-B and their trait and disease associations.

Shown here are all amino acid positions in the MHC-B group evolving at twice the baseline rate or more (log2 fold change ≥ 1). Many corresponding positions in human HLA-B have associations with autoimmune or infectious diseases, biomarkers, or TCR phenotypes. Disease associations were collected from a literature search of HLA fine-mapping studies with over 1000 cases (see Materials and methods).

Amino Acid Position	Evol. Rate Fold Change	Distance to Peptide (Å)	Associations
156	3.42	3.55	Chronic Hepatitis C (Hirata et al., 2019), HIV Set Point Viral Load (Luo et al., 2021), Asthma (Sakaue et al., 2021), Eosinophil Count (Sakaue et al., 2021), Hypothyroidism (Sakaue et al., 2021), Pediatric Asthma (Sakaue et al., 2021), Systolic Blood Pressure (Sakaue et al., 2021), Total Protein (Sakaue et al., 2021), TCR β Interaction Probability >50% (Sharon et al., 2016), Plasma Protein Levels of ADAM8, AGER, ASPSCR1, B2M, CCL16, CCL28, CCL4, CD200R1, CD5L, CDSN, CX3CL1, FCRL5, IGF2R, IL12A, IL12B, IL5RA, MICB, NUCB2, PDCD1, RARRES2, SFTPD, SIGLEC6, SNX2, TIMD4, TNFRSF4, TNR, TYRP1 (Krishna et al., 2024)
95	3.10	3.82	KLRF1 Plasma Protein Level (Krishna et al., 2024)
114	3.08	4.80	Rheumatoid Arthritis (Sakaue et al., 2021), Plasma Protein Levels of AIF1, CD1C, DDR1, IL15, LILRB2, MICB (Krishna et al., 2024)
116	2.84	3.44	Eosinophil Count (Hirata et al., 2019), HIV Control (McLaren et al., 2012), Angina (Sakaue et al., 2021), Allergic Rhinitis (Waage et al., 2018), Psoriasis (Zhou et al., 2016), Plasma Protein Levels of ADAM15, APOM, BTN2A1, CD1C, CFB, CXCL11, CXCL9, FLT4, GNLY, KLRF1, LILRB1, MICB, PLXDC2, TNF, TNFRSF13C, TNXB (Krishna et al., 2024)
70	2.64	3.71	Platelet (Sakaue et al., 2021), Plasma Protein Levels of CD8A, GZMA, MICB, NRP2 (Krishna et al., 2024)
97	2.37	4.29	Ankylosing Spondylitis (Butler-Laporte et al., 2023), HIV Set Point Viral Load (Luo et al., 2021), HIV Control (McLaren et al., 2012), Adult Height (Sakaue et al., 2021), Alkaline Phosphatase (Sakaue et al., 2021), Body Weight (Sakaue et al., 2021), C-reactive Protein (Sakaue et al., 2021), IgA Nephritis (Sakaue et al., 2021), Mean Arterial Pressure (Sakaue et al., 2021), Mean Corpuscular Hemoglobin (Sakaue et al., 2021), Pneumonia (Tian et al., 2017), Tonsillectomy (Tian et al., 2017), Plasma Protein Levels of ADGRE2, BTN3A2, CCL21, CCL3, CD1C, CD8A, CDSN, CPVL, DXO, EBI3, EFCAB14, HBEGF, HCG22, IL12A, IL12B, LILRB2, LRP1, LRPAP1, LTB, LY75, MANF, MANSC1, MICB, OSCAR, PLA2G10, PRTN3, SIGLEC10, STAB2, TEK, TNFSF13, ZNRD2 (Krishna et al., 2024)
24	2.35	4.68	Hepatic Cancer (Sakaue et al., 2021), Plasma Protein Levels of ADAM15, CXCL10, FCRL6, GZMB, MICB, TNFSF8 (Krishna et al., 2024)
163	2.23	4.13	Lung Cancer (Squamous Cell Carcinoma) (Ferreiro-Iglesias et al., 2018), Alanine Aminotransferase (Sakaue et al., 2021), TCR Expression (TRAV38-1) (Sharon et al., 2016), TCR α Interaction Probability >50% (Sharon et al., 2016), Plasma Protein Levels of DDR1, GZMA, LILRB1, MICB, NPTX1, SEPTIN3, TEK, WFDC2 (Krishna et al., 2024)
67	1.96	3.72	Graves’ Disease (Hirata et al., 2019), HIV Set Point Viral Load (Luo et al., 2021), Asthma (Sakaue et al., 2021), Psoriasis (Stuart et al., 2022), Plasma Protein Levels of AMBP, C2, CD160, CD28, CD48, CFB, FCRL1, FCRL6, FRZB, GP1BB, LILRB1, LTA, LY96, NID1, SIGLEC9, SORT1, THBD, TNFRSF4, TNFSF13B, TNXB, TP53BP1, VCAM1 (Krishna et al., 2024)
152	1.96	3.53	JIA (Oligoarthritis/RF-negative Polyarthritis) (Hinks et al., 2017), Plasma Protein Levels of LTBR, MICB, PLXNA4, RARRES2 (Krishna et al., 2024)
63	1.71	2.94	HIV Control (McLaren et al., 2012), Skin Cancer (Sakaue et al., 2021), B2M Plasma Protein Level (Krishna et al., 2024)
99	1.70	3.00	Plasma Protein Levels of APOM, CRTAM, DXO, IL15, MICB, OSCAR (Krishna et al., 2024)
66	1.58	3.41	TNFSF11 Plasma Protein Level (Krishna et al., 2024)
69	1.56	4.69	Parkinson’s Disease (Naito et al., 2021)
74	1.31	4.30	Chronic Sinusitis (Sakaue et al., 2021)
62	1.29	3.89	PGLYRP1 Plasma Protein Level (Krishna et al., 2024)
138	1.26	9.18
9	1.20	3.36	Primary Biliary Cholangitis (Darlay et al., 2018), Systemic Lupus Erythematosus (Molineros et al., 2019), Rheumatoid Arthritis (Raychaudhuri et al., 2012), Hyperthyroidism (Sakaue et al., 2021), Monocyte Count (Sakaue et al., 2021), Serum Creatinine (Sakaue et al., 2021), Psoriasis (Zhou et al., 2016), Plasma Protein Levels of CD1C, CX3CL1, IL12B, LIPF, LTA, MICB, PDCD1, RGMA, SGSH, SLAMF7, TNFRSF8 (Krishna et al., 2024)
81	1.03	4.04

Table 1—source data 1. Rapidly-evolving amino acids and their trait and disease associations for all studied genes. For each gene group, the rapidly-evolving amino acid positions (substitution rate log2 fold change >1) are listed alongside their disease and trait associations. group: gene group; position: amino acid position; evol_rate: substitution rate log2 fold change; min_dist: minimum distance to peptide (Å); disease_assoc: list of disease, trait, and other associations, with citations. https://cdn.elifesciences.org/articles/103547/elife-103547-table1-data1-v1.txt

In summary, we find that rapid evolution has primarily targeted amino acids within the peptide-binding region of each gene, and that these specific positions are likely the primary drivers of phenotypic associations at the MHC locus.

Discussion

The MHC region contains the clearest signals of balancing and directional selection in mammalian genomes, including extreme diversity, ancient trans-species polymorphism, and high rates of nonsynonymous evolution between allelic lineages. In humans, MHC/HLA variation is associated with risk for infectious and autoimmune diseases and many other traits, and HLA matching is critical for successful tissue transplantation (Kennedy et al., 2017; Smith et al., 2024; Lee et al., 2007).

Despite its evolutionary and clinical importance, the extreme diversity of the MHC makes it challenging to study, and basic questions about its evolutionary history remain unresolved. While past work has suggested ultra-deep TSP at this locus, in this study, we re-examined the region with modern, comprehensive data and a unified analysis framework. Using Bayesian evolutionary analysis, we report conclusive evidence for long-term TSP in seven classical genes, including between humans and OWM at MHC-DQB1. Thus, remarkably, allelic lineages at this gene have been maintained for at least 31 MY.

Our evidence for TSP at MHC-DQB1 spanning at least 31 MY places it among the most ancient examples of balancing selection known in any species, and almost certainly the oldest in primates. Aside from MHC, the deepest example within primates is at the ABO locus controlling blood type; it exhibits trans-species polymorphism between humans and gibbons, an age of 23 MY (Ségurel et al., 2012). In various chimpanzee species, OAS1, which helps inhibit viral replication, contains alleles up to 13 MY old (Ferguson et al., 2012). TSP between chimpanzee and human includes LAD1, a protein that maintains cell cohesion (6 MY; Teixeira et al., 2015), retroviral restriction factor TRIM5α (4-7 MY in apes and >8 MY in OWM; Cagliani et al., 2010; Newman et al., 2006), and ZC3HAV1, an antiviral protein leading to viral RNA degradation (6 MY; Cagliani et al., 2012), among others (Leffler et al., 2013).

Looking more broadly across the tree of life, ancient trans-species polymorphism occurs widely, albeit rarely. Several of the oldest examples are found in the MHC locus: MHC polymorphisms have been maintained for 35 MY in cetaceans (Xu et al., 2009), 40 MY in herons (Li et al., 2011), 48 MY in mole rats (Kundu and Faulkes, 2007), 70 MY in tree frogs (Zhao et al., 2013), and over 105 MY in salmonid fishes (Kiryu et al., 2005; Grimholt et al., 2015). There are examples in non-MHC loci as well; in cyanobacteria, polymorphism at the HEP island controlling heterocyst function has been maintained for 74 MY (Sano et al., 2018), in plants, S-genes determining self-incompatibility exhibit TSP spanning 36 MY (Ioerger et al., 1990; Igic et al., 2006; Fujii et al., 2016), and in Formica ants, alleles at a supergene underlying colony queen number have been maintained for over 30 MY (Purcell et al., 2021).

Paradoxically, given the extremely long-lived balancing selection acting in these lineages, many authors have also reported strong directional selection at the MHC (Brandt et al., 2018; Nunes et al., 2021; Bhatia et al., 2011). Indeed, within the phylogeny, we find that the most rapidly-evolving codons are substituted at around two- to fourfold the baseline rate, generating ample mutations upon which selection may act. For the classical genes (except MHC-DRA), these rapidly-evolving sites lie within the peptide binding regions of the corresponding proteins, usually very close to the peptide-contact surfaces. This does not hold true for most non-classical genes, supporting the idea that selection at the MHC is mainly driven by the peptide presentation pathway. Non-classical MHC-DMA and -DMB are a surprising exception, showing significantly elevated proportions of rapidly-evolving sites in exon 2 similarly to the classical genes (even though the MHC-DM molecule does not bind peptides). This pattern may be caused by its co-evolution with the classical Class II genes, and more research is needed to address this.

The primary role of classical MHC proteins is to present peptides for T cell recognition; we found that the same rapidly-evolving amino acids are associated with shaping T cell receptor (TCR) repertoires. Moreover, these amino acids are frequently associated with autoimmune and infectious diseases in HLA fine-mapping studies, particularly for Class I.

Taken together, we begin to see a comprehensive picture of the nature of primate MHC evolution. In response to rapidly-changing pathogen pressures, the PBRs of classical MHC proteins evolve to bind changing pathogen antigens and present them to TCRs. Broad lineages of MHC alleles are maintained over tens of millions of years by strong balancing selection, providing defense against a wide variety of different pathogens. Yet within these lineages, alleles turn over quickly in response to new specific threats. This reconciles evidence for TSP, the presence of thousands of alleles, and the existence of rapidly-evolving sites.

MHC molecules must evolve to detect pathogens with both specificity and sensitivity, and distinguishing self from non-self peptides is a challenging task. As MHC proteins evolve, there is an unavoidable flux between infection defense and autoimmune susceptibility. Additionally, many MHC proteins have roles in both innate and adaptive immunity. As a result, rapidly-evolving amino acids are associated with both infections and autoimmune conditions. In the future, disease studies within other primate species could provide insight into the trajectory of MHC evolution and might reveal evolutionary trade-offs. Perhaps balancing selection has kept the same amino acids disease-relevant across the entire primate evolutionary tree, or maybe the rapid turnover of MHC variation means different primate clades will have different disease associations.

One limitation of the data we used is that the vast majority of nonhuman MHC sequences were obtained via Sanger sequencing or next-generation sequencing methods. While highly accurate sequence-wise, these methods are limited by PCR-related technical artifacts such as heteroduplexes and chimeras. Allelic dropout is also a problem, because similar genes may not be amplified uniformly, resulting in alleles or entire genes being missed when the entire region is amplified simultaneously (Cheng et al., 2022). Additionally, MHC allele assignments require sequences from multiple exons, so the phasing of distant variants can make it difficult to assign alleles confidently (Liu, 2021). Many nonhuman MHC regions are also more complex than their human counterparts, containing unknown numbers of recently duplicated paralogs, copy number variants, and structural variants (Cheng et al., 2022; Heijmans et al., 2020). This makes it difficult to assemble MHC sequences spanning the entire region and assign sequences to genes, which can make the inference of TSP challenging. For example, if sequences from two entirely different genes are thought to be from the same gene, one may falsely conclude that they are highly-diverged alleles and suspect TSP. We relied mostly on common alleles from the IPD-MHC/HLA database, which have been confirmed in multiple individuals often from different research groups. This helps reduce the issue of chimeric or heteroduplex alleles being wrongly considered to be highly-diverged alleles. We also did not assess TSP when orthology was too ambiguous, instead only calculating Bayes factors when our trees and other work strongly supported sequences coming from the same gene. Nevertheless, convergent evolution and shared history can still result in ambiguous gene assignments (Dilthey, 2021). Luckily, long-read sequencing of the MHC region has the potential to solve many of these issues. Oxford Nanopore and PacBio HiFi sequencing have already been used to obtain high-quality MHC sequences in humans (Wenger et al., 2019; Jain et al., 2018; Liu, 2021; Bruijnesteijn, 2023), and researchers are beginning to explore their potential in non-model organisms (Cheng et al., 2022). These methods will be instrumental in increasing the number of alleles detected at MHC loci, resolving entire MHC haplotypes (thus facilitating detection of copy number and structural variation), and even detecting epigenetic modifications (Bruijnesteijn, 2023; Cheng et al., 2022; Karl et al., 2023; Viļuma et al., 2017; Fuselli et al., 2018; Maibach et al., 2017).

Although the primate MHC has been of interest to evolutionary biologists for more than 30 years, there is still much to be done to more fully document the evolution of the MHC genes within and between species. Moreover, we still have a limited understanding of how sequence changes map to functional differences among alleles, and how these relate to allele-specific profiles of pathogen protection (and autoimmunity risk). However, functional and computational advances will provide key opportunities for progress on these problems (Radwan et al., 2020; Vizcaíno et al., 2020).

Materials and methods

Data

We downloaded MHC allele nucleotide sequences for all human and nonhuman genes from the IPD Database (updated January 2023) (Barker et al., 2023; Maccari et al., 2017; Maccari et al., 2020). To supplement the alleles available in the database, we also collected nucleotide sequences from NCBI using the Entrez E-utilities with query ‘histocompatibility AND txidX AND alive[prop]’, where X is a taxon of interest.
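
A query of this form can be assembled programmatically. The sketch below only builds the search URL and makes no network request. It assumes the ESearch endpoint of the E-utilities and the `nuccore` database; the choice of database, the `retmax` value, the function names, and the example taxon (NCBI taxonomy ID 9544, rhesus macaque) are illustrative assumptions, not a record of the exact calls we made.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities ESearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(taxid: int) -> str:
    """Fill the taxon ID into the query template quoted in the text."""
    return f"histocompatibility AND txid{taxid} AND alive[prop]"


def build_esearch_url(taxid: int, db: str = "nuccore",
                      retmax: int = 100) -> str:
    """Return an ESearch URL for one taxon (db and retmax are assumptions)."""
    params = {"db": db, "term": build_query(taxid), "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Example taxon chosen for illustration: 9544 (Macaca mulatta).
example_url = build_esearch_url(9544)
print(example_url)
```

The returned identifiers would then need to be fetched and filtered in a separate step, which is not shown here.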

We wanted to provide ‘zoomed-in’ versions of various subtrees within the multi-gene trees presented in our companion paper (Fortier and Pritchard, 2025). Thus, we included more species and more alleles per species than in the original trees. In each tree, we also included a ‘backbone’ of sequences from the overall multi-gene tree to provide context for each expanded clade (lists of alleles provided as Supplementary file 1).

For Class I, we expanded the following clades: (1) MHC-A-related genes (MHC-A group), (2) MHC-B-related genes (MHC-B group), (3) MHC-C-related genes (MHC-C group), (4) MHC-E-related genes (MHC-E group), (5) MHC-F-related genes (MHC-F group), and (6) MHC-G-related genes (MHC-G group). For Class IIA, we expanded: (1) MHC-DMA-related genes (MHC-DMA group), (2) MHC-DOA-related genes (MHC-DOA group), (3) MHC-DRA-related genes (MHC-DRA group), (4) MHC-DPA-related genes (MHC-DPA group), and (5) MHC-DQA-related genes (MHC-DQA group). For Class IIB, we expanded: (1) MHC-DMB-related genes (MHC-DMB group), (2) MHC-DOB-related genes (MHC-DOB group), (3) MHC-DRB-related genes (MHC-DRB group), (4) MHC-DPB-related genes (MHC-DPB group), and (5) MHC-DQB-related genes (MHC-DQB group). These sets were inclusive of all orthologs and paralogs of a given human gene across all species we included (see our companion paper for more information; Fortier and Pritchard, 2025). For example, the MHC-A group includes human HLA-A and its 1:1 orthologs in the apes, the expanded MHC-A and -AG paralogs of the OWM, chimpanzee-specific Patr-AL, gorilla-specific Gogo/Gobe-OKO, orangutan-specific Poab/Popy-Ap, and pseudogenes MHC-H and -Y. Tables 2–4 provide an overview of which genes from which species were included in each of these named groups. See Supplementary file 1 for lists of all alleles included within each group.

Table 2

Data summary for Class I.

Each row represents a species, and each column represents a gene group. Each cell lists the number of alleles included for each gene represented by that gene group. Bolded entries are ‘backbone’ sequences that are included in every group.

Clade	Species Group	Species	Latin Name	Pref.	MHC-A Group	MHC-B Group	MHC-C Group	MHC-E Group	MHC-F Group	MHC-G Group
Ape	Human	Human	Homo sapiens	Hosa	63 –A, 1 –A, 4 –H, 1 –H, 2 –Y, 1 –B, 1 –L, 1 –C, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K, 1 –V, 1 –W	1 –A, 1 –H, 91 –B, 1 –B, 1 –L, 1 –C, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K, 1 –V, 1 –W	1 –A, 1 –H, 1 –B, 1 –L, 90 –C, 1 –C, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K, 1 –V, 1 –W	1 –A, 1 –H, 1 –B, 1 –L, 1 –C, 15 –E, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K, 1 –V, 1 –W	1 –A, 1 –H, 1 –B, 1 –L, 1 –C, 1 –E, 10 –F, 1 –F, 1 –G, 1 –J, 1 –K, 1 –V, 1 –W	1 –A, 1 –H, 1 –B, 1 –L, 1 –C, 1 –E, 1 –F, 17 –G, 1 –G, 1 –J, 1 –K, 1 –V, 1 –W
	Chimpanzee	Bonobo	Pan paniscus	Papa	10 –A, 1 –H	26 –B	11 –C	1 –E	1 –F	2 –G
	Chimpanzee	Chimpanzee	Pan troglodytes	Patr	30 –A, 3 –A/AL/OKO, 1 –H	48 –B	28 –C	2 –E	3 –F	1 –G
	Gorilla	Eastern gorilla	Gorilla beringei	Gobe	1 –A/AL/OKO	1 –B	1 –C
	Gorilla	Western gorilla	Gorilla gorilla	Gogo	4 –A, 4 –A/AL/OKO, 1 –H, 3 –Y	10 –B, 3 –B	8 –C	2 –E	3 –F	2 –G
	Orangutan	Sumatran orangutan	Pongo abelii	Poab	5 –A/AL/OKO, 1 –H	12 –B	2 –C	1 –E	1 –F	1 –G
	Orangutan	Bornean orangutan	Pongo pygmaeus	Popy	12 –A/AL/OKO, 1 –H, 7 –Ap	20 –B	5 –C	1 –E	1 –F
	Gibbon	Lar gibbon	Hylobates lar	Hyla	2 –A	1 –B
		Silvery gibbon	Hylobates moloch	Hymo	1 –A	1 unknown		1 –E	1 –F
		Northern white-cheeked gibbon	Nomascus leucogenys	Nole	1 –A	2 unknown		1 –E	1 –F
OWM	Baboon	Olive baboon	Papio anubis	Paan	1 –A, 1 –AG	1 –B		1 –E	3 –F	1 –AG
		Hamadryas baboon	Papio hamadryas	Paha		2 –B
		Yellow baboon	Papio cynocephalus	Pacy				1 –E
	Gelada	Gelada	Theropithecus gelada	Thge	1 –A, 1 –AG			1 –E	1 –F	1 –G
	Mangabey	Sooty mangabey	Cercocebus atys	Ceat	3 –A, 1 –AG	1 –B, 1 –I		5 –E	5 –F
	Drill	Drill	Mandrillus leucophaeus	Male				1 –E	1 –F	2 –G
	Macaque	Crab-eating macaque	Macaca fascicularis	Mafa	1 –L, 1 –V, 1 –W, 1 –A8	1 –B, 1 –L, 1 –V, 1 –W	1 –L, 1 –V, 1 –W	1 –L, 17 –E, 1 –V, 1 –W	1 –L, 24 –F, 1 –V, 1 –W	1 –L, 9 –G, 1 –V, 1 –W
		Rhesus macaque	Macaca mulatta	Mamu	8 –A, 1 –A, 5 –AG, 1 –AG, 1 –B, 1 –I, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K	1 –A, 1 –AG, 9 –B, 1 –B, 1 –I, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K	1 –A, 1 –AG, 1 –B, 1 –I, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K	1 –A, 1 –AG, 1 –B, 1 –I, 30 –E, 1 –E, 1 –F, 1 –G, 1 –J, 1 –K	1 –A, 1 –AG, 1 –B, 1 –I, 1 –E, 18 –F, 1 –F, 1 –G, 1 –J, 1 –K	1 –A, 5 –AG, 1 –AG, 1 –B, 1 –I, 1 –E, 1 –F, 4 –G, 1 –G, 1 –J, 1 –K
		Stump-tailed macaque	Macaca arctoides	Maar		1 –B
		Assam macaque	Macaca assamensis	Maas		1 –B
		Northern pig-tailed macaque	Macaca leonina	Malo		1 –B
		Southern pig-tailed macaque	Macaca nemestrina	Mane		2 –B		10 –E	7 –F
		Tibetan macaque	Macaca thibetana	Math		1 –B		1 –E	1 –F	1 –G
	Grivet	Grivet	Chlorocebus aethiops	Chae						2 –G
	Vervet Monkey	Vervet monkey	Chlorocebus pygerythrus	Chpy		2 –B
	Green Monkey	Green monkey	Chlorocebus sabaeus	Chsa	1 –A, 1 –AG, 1 –A8	1 –B		5 –E	1 –F	1 –AG
	Guenon	Blue monkey	Cercopithecus mitis	Cemi		2 –B
	Colobus	Angola colobus	Colobus angolensis	Coan	1 –AG			2 –E	1 –F
	Colobus	Ugandan red colobus	Piliocolobus tephrosceles	Pite				1 –E	1 –F	1 –G
	Langur	Francois’ langur	Trachypithecus francoisi	Trfr	1 –A, 1 –AG			1 –E	1 –F	1 –G
	Snub-Nosed Monkey	Golden snub-nosed monkey	Rhinopithecus roxellana	Rhro	1 –A, 1 –AG			1 –E		1 –G
	Snub-Nosed Monkey	Black-and-white snub-nosed monkey	Rhinopithecus bieti	Rhbi				1 –E
NWM	Tamarin	Cotton-top tamarin	Saguinus oedipus	Saoe				1 –E	4 –F	9 –G, 1 –PS, 3 –N1/3/4
		Brown-mantled tamarin	Leontocebus fuscicollis	Lefu						5 –G
		Golden lion tamarin	Leontopithecus rosalia	Lero						2 –G
		White-lipped tamarin	Saguinus labiatus	Sala						9 –G
	Marmoset	Common marmoset	Callithrix jacchus	Caja	1 –B, 1 –E, 1 –F, 1 –G	8 –B, 1 –B, 1 –E, 1 –F, 1 –G	1 –B, 1 –E, 1 –F, 1 –G	1 –B, 2 –E, 1 –E, 1 –F, 1 –G	1 –B, 1 –E, 17 –F, 1 –F, 1 –G	1 –B, 1 –E, 1 –F, 76 –G, 1 –G, 1 –PS
	Night Monkey	Three-striped night monkey	Aotus trivirgatus	Aotr				1 –E		3 –G, 1 –PS
		Gray-bellied night monkey	Aotus lemurinus	Aole					5 –F
		Nancy Ma’s night monkey	Aotus nancymaae	Aona					3 –F	1 –B, 7 –G
	Capuchin	Panamanian white-faced capuchin	Cebus imitator	Ceim				1 –E	1 –F	6 –G, 1 unknown
	Capuchin	Tufted capuchin	Sapajus apella	Saap				1 –E	1 –F	4 –G, 2 unknown
	Squirrel Monkey	Black-capped squirrel monkey	Saimiri boliviensis	Sabo				3 –E	2 –F	1 –B, 3 –G, 1 unknown
	Squirrel Monkey	Common squirrel monkey	Saimiri sciureus	Sasc						1 –G
	Spider Monkey	White-bellied spider monkey	Ateles belzebuth	Atbe		1 –B		1 –E		3 –G
	Spider Monkey	Black-headed spider monkey	Ateles fusciceps	Atfu		1 –B		2 –E		9 –G
	Saki	White-faced saki	Pithecia pithecia	Pipi		1 –B		1 –E		4 –G
Tarsier	Tarsier	Philippine tarsier	Carlito syrichta	Casy	1 unknown	1 unknown	1 unknown	1 unknown	1 unknown	1 unknown
Strepsirrhini	Lemur	Ring-tailed lemur	Lemur catta	Leca	1 unknown	1 unknown	1 unknown	1 unknown	1 unknown	1 unknown

Table 3

Data summary for Class IIA.

Clade	Species Group	Species	Latin Name	Pref.	MHC-DPA Group	MHC-DQA Group	MHC-DRA Group	MHC-DMA Group	MHC-DOA Group
Ape	Human	Human	Homo sapiens	Hosa	22 –DPA, 2 –DPA, 2 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	2 –DPA, 22 –DQA, 2 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	2 –DPA, 2 –DQA, 4 –DRA, 1 –DRA, 1 –DMA, 1 –DOA	2 –DPA, 2 –DQA, 1 –DRA, 8 –DMA, 1 –DMA, 1 –DOA	2 –DPA, 2 –DQA, 1 –DRA, 1 –DMA, 14 –DOA, 1 –DOA
	Chimpanzee	Bonobo	Pan paniscus	Papa	1 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
	Chimpanzee	Chimpanzee	Pan troglodytes	Patr	5 –DPA	6 –DQA	3 –DRA	1 –DMA	1 –DOA
	Gorilla	Western gorilla	Gorilla gorilla	Gogo	3 –DPA	10 –DQA	1 –DRA	1 –DMA	1 –DOA
	Orangutan	Sumatran orangutan	Pongo abelii	Poab	4 –DPA	6 –DQA	4 –DRA	1 –DMA	1 –DOA
	Orangutan	Bornean orangutan	Pongo pygmaeus	Popy	4 –DPA	4 –DQA	2 –DRA
	Gibbon	Silvery gibbon	Hylobates moloch	Hymo	1 –DPA	3 –DQA	1 –DRA	1 –DMA	1 –DOA
		Northern white-cheeked gibbon	Nomascus leucogenys	Nole	1 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
		Lar gibbon	Hylobates lar	Hyla		6 –DQA
OWM	Baboon	Olive baboon	Papio anubis	Paan	13 –DPA	8 –DQA	3 –DRA	1 –DMA	1 –DOA
		Hamadryas baboon	Papio hamadryas	Paha	1 –DPA	3 –DQA
		Yellow baboon	Papio cynocephalus	Pacy		7 –DQA
		Guinea baboon	Papio papio	Papp		4 –DQA
	Gelada	Gelada	Theropithecus gelada	Thge	2 –DPA	3 –DQA	1 –DRA	1 –DMA	1 –DOA
	Mangabey	Sooty mangabey	Cercocebus atys	Ceat	2 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
	Mangabey	Black crested mangabey	Lophocebus aterrimus	Loat		1 –DQA
	Drill	Drill	Mandrillus leucophaeus	Male	1 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
	Macaque	Crab-eating macaque	Macaca fascicularis	Mafa	30 –DPA, 1 –DMA, 1 –DOA	11 –DQA, 1 –DMA, 1 –DOA	16 –DRA, 1 –DMA, 1 –DOA	7 –DMA, 1 –DMA, 1 –DOA	1 –DMA, 6 –DOA, 1 –DOA
		Northern pig-tailed macaque	Macaca leonina	Malo	6 –DPA	8 –DQA	5 –DRA	2 –DMA	7 –DOA
		Rhesus macaque	Macaca mulatta	Mamu	22 –DPA, 1 –DPA, 2 –DQA, 1 –DRA	1 –DPA, 9 –DQA, 2 –DQA, 1 –DRA	1 –DPA, 2 –DQA, 12 –DRA, 1 –DRA	1 –DPA, 2 –DQA, 1 –DRA, 4 –DMA	1 –DPA, 2 –DQA, 1 –DRA, 1 –DOA
		Southern pig-tailed macaque	Macaca nemestrina	Mane	14 –DPA	10 –DQA	11 –DRA	1 –DMA	1 –DOA
		Tibetan macaque	Macaca thibetana	Math	7 –DPA	1 –DQA	1 –DRA	10 –DMA
		Stump-tailed macaque	Macaca arctoides	Maar	1 –DPA	2 –DQA
	Grivet	Grivet	Chlorocebus aethiops	Chae		6 –DQA
	Green Monkey	Green monkey	Chlorocebus sabaeus	Chsa	5 –DPA	2 –DQA	1 –DRA		1 –DOA
	Guenon	Blue monkey	Cercopithecus mitis	Cemi		5 –DQA
	Guenon	De Brazza’s monkey	Cercopithecus neglectus	Cene		2 –DQA
	Colobus	Angola colobus	Colobus angolensis	Coan		2 –DQA	1 –DRA	1 –DMA	1 –DOA
		Ugandan red colobus	Piliocolobus tephrosceles	Pite	1 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
		Mantled guereza	Colobus guereza	Cogu		1 –DQA
	Langur	Francois’ langur	Trachypithecus francoisi	Trfr	1 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
	Snub-Nosed Monkey	Black-and-white snub-nosed monkey	Rhinopithecus bieti	Rhbi			1 –DRA	1 –DMA	1 –DOA
	Snub-Nosed Monkey	Golden snub-nosed monkey	Rhinopithecus roxellana	Rhro	1 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
NWM	Tamarin	Cotton-top tamarin	Saguinus oedipus	Saoe		4 –DQA
	Marmoset	Common marmoset	Callithrix jacchus	Caja	1 –DPA, 2 –DQA, 1 –DMA, 1 –DOA	1 –DPA, 6 –DQA, 2 –DQA, 1 –DMA, 1 –DOA	1 –DPA, 2 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DPA, 2 –DQA, 1 –DMA, 1 –DOA	1 –DPA, 2 –DQA, 1 –DMA, 1 –DOA
	Night Monkey	Nancy Ma’s night monkey	Aotus nancymaae	Aona	1 –DPA, 1 –DRA	6 –DQA, 1 –DRA	2 –DRA, 1 –DRA	1 –DRA, 1 –DMA	1 –DRA, 1 –DOA
		Gray-bellied night monkey	Aotus lemurinus	Aole		3 –DQA
		Spix’s night monkey	Aotus vociferans	Aovo			3 –DRA
	Capuchin	Panamanian white-faced capuchin	Cebus imitator	Ceim	1 –DPA	3 –DQA	1 –DRA	1 –DMA	1 –DOA
	Capuchin	Tufted capuchin	Sapajus apella	Saap	1 –DPA	3 –DQA	1 –DRA	1 –DMA	1 –DOA
	Squirrel Monkey	Black-capped squirrel monkey	Saimiri boliviensis	Sabo		2 –DQA	1 –DRA	1 –DMA	1 –DOA
	Squirrel Monkey	Common squirrel monkey	Saimiri sciureus	Sasc	3 –DPA
Tarsier	Tarsier	Philippine tarsier	Carlito syrichta	Casy	3 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
Strepsirrhini	Lemur	Ring-tailed lemur	Lemur catta	Leca	1 –DPA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DPA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DPA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DPA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DPA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA
	Lemur	Gray mouse lemur	Microcebus murinus	Mimu	1 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
	Loris	Sunda slow loris	Nycticebus coucang	Nyco	1 –DPA		2 –DRA	1 –DMA	1 –DOA
	Galago	Northern greater galago	Otolemur garnettii	Otga	1 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
	Sifaka	Coquerel’s sifaka	Propithecus coquereli	Prco	2 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
Flying Lemur	Flying Lemur	Sunda flying lemur	Galeopterus variegatus	Gava	2 –DPA		1 –DRA	1 –DMA	1 –DOA
Tree Shrew	Tree Shrew	Chinese tree shrew	Tupaia chinensis	Tuch	4 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
Glires	Rodent	Groundhog	Marmota monax	Mamo	1 –DPA, 1 –DPA	1 –DPA	1 –DPA, 1 –DRA	1 –DPA, 1 –DMA	1 –DPA, 1 –DOA
	Rodent	Brown rat	Rattus norvegicus	Rano	1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	2 –DQA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DQA, 2 –DRA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA, 1 –DOA
	Pika	Plateau pika	Ochotona curzoniae	Occu	2 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
	Pika	American pika	Ochotona princeps	Ocpr	2 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
Laurasiatheria	Artiodactyla	Bactrian camel	Camelus bactrianus	Caba	1 –DPA	1 –DQA		1 –DMA	1 –DOA
		Wild boar	Sus scrofa	SLA	1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DQA, 1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DQA, 3 –DRA, 1 –DRA, 1 –DMA, 1 –DOA	1 –DQA, 1 –DRA, 4 –DMA, 1 –DMA, 1 –DOA	1 –DQA, 1 –DRA, 1 –DMA, 1 –DOA, 1 –DOA
		Even-toed ungulates	Bos sp.	BoLA		2 –DQA
		Domestic yak	Bos grunniens	Bogr		2 –DQA
		Water buffalo	Bubalus bubalis	Bubu		2 –DQA
		Sheep	Ovis aries	Ovar		4 –DQA	3 –DRA
		Dromedary camel	Camelus dromedarius	Cadr			1 –DRA
		Bighorn sheep	Ovis canadensis	Ovca			3 –DRA
	Ferungulata	Sea otter	Enhydra lutris	Enlu		1 –DQA	1 –DRA	1 –DMA	1 –DOA
		Cat	Felis catus	Feca			1 –DRA	1 –DMA	1 –DOA
		Sunda pangolin	Manis javanica	Maja		1 –DQA	1 –DRA	1 –DMA	1 –DOA
		Cougar	Puma concolor	Puco	1 –DPA			1 –DMA	1 –DOA
		Jaguarundi	Puma yagouaroundi	Puya	1 –DPA		1 –DRA	1 –DMA	1 –DOA
		Steller sea lion	Eumetopias jubatus	Euju	1 –DPA	1 –DQA
		Horse	Equus caballus	Eqca		5 –DQA	3 –DRA
	Bat	Big brown bat	Eptesicus fuscus	Epfu	1 –DPA	2 –DQA	1 –DRA	1 –DMA
		Kuhl’s pipistrelle	Pipistrellus kuhlii	Piku	1 –DPA	1 –DQA	1 –DRA	1 –DMA
		Large flying fox	Pteropus vampyrus	Ptva		2 –DQA	1 –DRA		1 –DOA
	Mole	Star-nosed mole	Condylura cristata	Cocr		3 –DQA			1 –DOA
Atlantogenata	Xenarthra	Linnaeus’s two-toed sloth	Choloepus didactylus	Chdi	1 –DPA	1 –DQA	1 –DRA	1 –DMA	1 –DOA
	Xenarthra	Nine-banded armadillo	Dasypus novemcinctus	Dano	1 –DPA	2 –DQA	1 –DRA	1 –DMA	1 –DOA
	Afrotheria	Cape golden mole	Chrysochloris asiatica	Chas	1 –DPA		1 –DRA	1 –DMA	1 –DOA
		Cape elephant shrew	Elephantulus edwardii	Eled	1 –DPA	1 –DPA, 1 –DQA	1 –DPA, 1 –DRA	1 –DPA, 1 –DMA	1 –DPA, 1 –DOA
		Aardvark	Orycteropus afer	Oraf	1 –DPA	1 –DQA		1 –DMA	1 –DOA
		West Indian manatee	Trichechus manatus	Trma	2 –DPA	1 –DQA	1 –DRA	1 –DMA
		Lesser hedgehog tenrec	Echinops telfairi	Ecte			1 –DRA		1 –DOA

Table 4

Data summary for Class IIB.

Clade	Species Group	Species	Latin Name	Pref.	MHC-DPB Group	MHC-DQB Group	MHC-DRB Group	MHC-DMB Group	MHC-DOB Group
Ape	Human	Human	Homo sapiens	Hosa	74 –DPB, 2 –DPB, 2 –DQB, 9 –DRB, 1 –DMB, 1 –DOB	2 –DPB, 24 –DQB, 2 –DQB, 9 –DRB, 1 –DMB, 1 –DOB	2 –DPB, 2 –DQB, 46 –DRB, 9 –DRB, 1 –DMB, 1 –DOB	2 –DPB, 2 –DQB, 9 –DRB, 6 –DMB, 1 –DMB, 1 –DOB	2 –DPB, 2 –DQB, 9 –DRB, 1 –DMB, 14 –DOB, 1 –DOB
	Chimpanzee	Bonobo	Pan paniscus	Papa	8 –DPB	2 –DQB	5 –DRB	2 –DMB	1 –DOB
	Chimpanzee	Chimpanzee	Pan troglodytes	Patr	6 –DPB	9 –DQB	17 –DRB	2 –DMB	2 –DOB
	Gorilla	Western gorilla	Gorilla gorilla	Gogo	5 –DPB	10 –DQB	7 –DRB	2 –DMB	1 –DOB
	Orangutan	Sumatran orangutan	Pongo abelii	Poab	5 –DPB	6 –DQB	7 –DRB	1 –DMB	1 –DOB
	Orangutan	Bornean orangutan	Pongo pygmaeus	Popy	5 –DPB	3 –DQB	7 –DRB	3 –DMB
	Gibbon	Silvery gibbon	Hylobates moloch	Hymo	1 –DPB	2 –DQB	5 –DRB	1 –DMB	1 –DOB
		Northern white-cheeked gibbon	Nomascus leucogenys	Nole	1 –DPB	2 –DQB		1 –DMB	1 –DOB
		Lar gibbon	Hylobates lar	Hyla		4 –DQB
OWM	Baboon	Olive baboon	Papio anubis	Paan	5 –DPB	5 –DQB	11 –DRB	1 –DMB
		Hamadryas baboon	Papio hamadryas	Paha		3 –DQB	1 –DRB
		Chacma baboon	Papio ursinus	Paur			8 –DRB
	Gelada	Gelada	Theropithecus gelada	Thge	2 –DPB	1 –DQB	3 –DRB	1 –DMB	1 –DOB
	Mangabey	Sooty mangabey	Cercocebus atys	Ceat	3 –DPB	2 –DQB	1 –DRB	1 –DMB	1 –DOB
	Drill	Drill	Mandrillus leucophaeus	Male	1 –DPB		1 –DRB	1 –DMB	1 –DOB
	Mandrill	Mandrill	Mandrillus sphinx	Masp			10 –DRB
	Macaque	Crab-eating macaque	Macaca fascicularis	Mafa	7 –DPB, 2 –DRB, 1 –DMB, 1 –DOB	5 –DQB, 2 –DRB, 1 –DMB, 1 –DOB	6 –DRB, 2 –DRB, 1 –DMB, 1 –DOB	2 –DRB, 4 –DMB, 1 –DMB, 1 –DOB	2 –DRB, 1 –DMB, 6 –DOB, 1 –DOB
		Northern pig-tailed macaque	Macaca leonina	Malo	5 –DPB	5 –DQB	3 –DRB	2 –DMB	3 –DOB
		Rhesus macaque	Macaca mulatta	Mamu	5 –DPB, 2 –DPB, 1 –DQB, 4 –DRB	2 –DPB, 4 –DQB, 1 –DQB, 4 –DRB	2 –DPB, 1 –DQB, 10 –DRB, 4 –DRB	2 –DPB, 1 –DQB, 4 –DRB, 5 –DMB	2 –DPB, 1 –DQB, 4 –DRB, 1 –DOB
		Southern pig-tailed macaque	Macaca nemestrina	Mane	5 –DPB	5 –DQB	6 –DRB	1 –DMB	1 –DOB
		Tibetan macaque	Macaca thibetana	Math	13 –DPB	6 –DQB	1 –DRB	1 –DMB	1 –DOB
		Stump-tailed macaque	Macaca arctoides	Maar		5 –DQB	2 –DRB
		Japanese macaque	Macaca fuscata	Mafu			3 –DRB
		Lion-tailed macaque	Macaca silenus	Masi			3 –DRB
	Grivet	Grivet	Chlorocebus aethiops	Chae		3 –DQB	6 –DRB
	Green Monkey	Green monkey	Chlorocebus sabaeus	Chsa	3 –DPB	4 –DQB	7 –DRB	1 –DMB	1 –DOB
	Colobus	Angola colobus	Colobus angolensis	Coan	2 –DPB	2 –DQB		1 –DMB	1 –DOB
	Colobus	Ugandan red colobus	Piliocolobus tephrosceles	Pite	1 –DPB	1 –DQB	3 –DRB	1 –DMB	1 –DOB
	Langur	Francois’ langur	Trachypithecus francoisi	Trfr	2 –DPB	1 –DQB	3 –DRB	1 –DMB	1 –DOB
	Langur	Gray langur	Semnopithecus entellus	Seen	1 –DPB
	Snub-Nosed Monkey	Black-and-white snub-nosed monkey	Rhinopithecus bieti	Rhbi				1 –DMB	1 –DOB
	Snub-Nosed Monkey	Golden snub-nosed monkey	Rhinopithecus roxellana	Rhro	1 –DPB	1 –DQB	2 –DRB	1 –DMB	1 –DOB
NWM	Tamarin	Cotton-top tamarin	Saguinus oedipus	Saoe	3 –DPB	4 –DQB	8 –DRB
	Tamarin	White-lipped tamarin	Saguinus labiatus	Sala			3 –DRB
	Marmoset	Common marmoset	Callithrix jacchus	Caja	1 –DPB, 1 –DPB, 2 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 3 –DQB, 2 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 2 –DQB, 3 –DRB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 2 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 2 –DQB, 1 –DRB, 1 –DMB, 1 –DOB
	Night Monkey	Nancy Ma’s night monkey	Aotus nancymaae	Aona	4 –DPB	3 –DQB	6 –DRB	1 –DMB	1 –DOB
		Gray-bellied night monkey	Aotus lemurinus	Aole	3 –DPB	1 –DQB
		Azara’s night monkey	Aotus azarae	Aoaz			2 –DRB
		Black-headed night monkey	Aotus nigriceps	Aoni			3 –DRB
		Three-striped night monkey	Aotus trivirgatus	Aotr			3 –DRB
		Spix’s night monkey	Aotus vociferans	Aovo			3 –DRB
	Capuchin	Panamanian white-faced capuchin	Cebus imitator	Ceim	1 –DPB	3 –DQB	1 –DRB	1 –DMB	1 –DOB
	Capuchin	Tufted capuchin	Sapajus apella	Saap	1 –DPB	4 –DQB	3 –DRB	1 –DMB
	Squirrel Monkey	Black-capped squirrel monkey	Saimiri boliviensis	Sabo	1 –DPB	3 –DQB		1 –DMB	1 –DOB
	Squirrel Monkey	Common squirrel monkey	Saimiri sciureus	Sasc			2 –DRB
	Spider Monkey	White-bellied spider monkey	Ateles belzebuth	Atbe			2 –DRB
	Howler Monkey	Guatemalan black howler	Alouatta pitta	Alpi			2 –DRB
	Saki	White-faced saki	Pithecia pithecia	Pipi			3 –DRB
Tarsier	Tarsier	Philippine tarsier	Carlito syrichta	Casy	2 –DPB	1 –DQB	2 –DRB	1 –DMB	1 –DOB
Strepsirrhini	Lemur	Ring-tailed lemur	Lemur catta	Leca	1 –DPB, 1 –DPB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DPB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB
	Lemur	Gray mouse lemur	Microcebus murinus	Mimu	1 –DPB	3 –DQB	2 –DRB	1 –DMB	1 –DOB
	Loris	Sunda slow loris	Nycticebus coucang	Nyco	1 –DPB
	Galago	Northern greater galago	Otolemur garnettii	Otga	1 –DPB	1 –DQB	1 –DRB	1 –DMB	1 –DOB
	Sifaka	Coquerel’s sifaka	Propithecus coquereli	Prco	1 –DPB	1 –DQB		1 –DMB	1 –DOB
Flying Lemur	Flying Lemur	Sunda flying lemur	Galeopterus variegatus	Gava	2 –DPB		2 –DRB	1 –DMB	1 –DOB
Tree Shrew	Tree Shrew	Chinese tree shrew	Tupaia chinensis	Tuch	1 –DPB			1 –DMB	1 –DOB
Glires	Rodent	Groundhog	Marmota monax	Mamo	1 –DPB, 1 –DPB	1 –DPB	1 –DPB	1 –DPB, 1 –DMB	1 –DPB
	Rodent	Brown rat	Rattus norvegicus	Rano	1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	2 –DQB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DQB, 3 –DRB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DQB, 1 –DRB, 1 –DMB, 1 –DMB, 1 –DOB	1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB
	Pika	American pika	Ochotona princeps	Ocpr		1 –DQB	1 –DRB	1 –DMB
	Pika	Plateau pika	Ochotona curzoniae	Occu	1 –DPB	2 –DQB			1 –DOB
Laurasiatheria	Artiodactyla	Bactrian camel	Camelus bactrianus	Caba				1 –DMB	1 –DOB
		Dromedary camel	Camelus dromedarius	Cadr	1 –DPB	1 –DPB	1 –DPB	1 –DPB	1 –DPB
		Wild boar	Sus scrofa	SLA	1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	2 –DQB, 1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DQB, 4 –DRB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DQB, 1 –DRB, 1 –DMB, 1 –DOB	1 –DQB, 1 –DRB, 1 –DMB, 2 –DOB, 1 –DOB
		Even-toed ungulates	Bos sp.	BoLA		1 –DQB	2 –DRB
		Water buffalo	Bubalus bubalis	Bubu		1 –DQB
		Wild Bactrian camel	Camelus ferus	Cafe		1 –DQB
		Sheep	Ovis aries	Ovar		6 –DQB	1 –DRB
		Goat	Capra hircus	Cahi			2 –DRB
		Bighorn sheep	Ovis canadensis	Ovca			1 –DRB
	Ferungulata	Horse	Equus caballus	Eqca		6 –DQB	3 –DRB	5 –DMB	3 –DOB
		Sunda pangolin	Manis javanica	Maja				1 –DMB	1 –DOB
		Cougar	Puma concolor	Puco	1 –DPB			1 –DMB
		Jaguarundi	Puma yagouaroundi	Puya				1 –DMB
		Northern elephant seal	Mirounga angustirostris	Mian			1 –DRB		1 –DOB
		Sea otter	Enhydra lutris	Enlu	1 –DPB	1 –DQB
		Steller sea lion	Eumetopias jubatus	Euju		1 –DQB
	Bat	Big brown bat	Eptesicus fuscus	Epfu		2 –DQB	1 –DRB	1 –DMB
		Large flying fox	Pteropus vampyrus	Ptva		1 –DQB			1 –DOB
		Kuhl’s pipistrelle	Pipistrellus kuhlii	Piku	1 –DPB	1 –DQB
	Mole	Star-nosed mole	Condylura cristata	Cocr		2 –DQB	1 –DRB	1 –DMB	1 –DOB
Atlantogenata	Xenarthra	Linnaeus’s two-toed sloth	Choloepus didactylus	Chdi	2 –DPB	1 –DQB		1 –DMB	1 –DOB
	Xenarthra	Nine-banded armadillo	Dasypus novemcinctus	Dano	1 –DPB
	Afrotheria	Aardvark	Orycteropus afer	Oraf		1 –DQB			1 –DOB
		Cape elephant shrew	Elephantulus edwardii	Eled	1 –DPB
		West Indian manatee	Trichechus manatus	Trma	1 –DPB
		Lesser hedgehog tenrec	Echinops telfairi	Ecte			1 –DRB

We aligned each group separately using MUSCLE (Edgar, 2004) with default settings and then manually adjusted the alignments, following those we had already produced for the multi-gene trees in our companion paper (Fortier and Pritchard, 2025).

Nucleotide diversity

The classical MHC region is defined as chr6:28,510,120–33,480,577 (GRCh38) (Genome Reference Consortium, 2022). Nucleotide diversity (π) was calculated on modern human data from the 1000 Genomes Project (Auton et al., 2015) using VCFtools (0.1.15) (Danecek et al., 2011). For the entire MHC region (Figure 1A), π was calculated in 5000 bp sliding windows with a step size of 1000 bp. For each gene separately (Figure 1B and C, Figure 1—figure supplement 1), π was calculated in 50 bp sliding windows with a step size of 10 bp.
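To make the windowed calculation concrete, the following is a minimal Python sketch of per-site nucleotide diversity averaged over sliding windows. It is an illustration only: the analysis itself used VCFtools, whose handling of window boundaries, missing genotypes, and multi-allelic sites may differ. The function names and the `sites` input format (position mapped to alternate-allele count and number of chromosomes sampled) are assumptions made for this example.

```python
from typing import Dict, List, Tuple


def site_pi(alt_count: int, n_chrom: int) -> float:
    """Unbiased heterozygosity at one biallelic site: 2p(1-p) * n/(n-1)."""
    if n_chrom < 2:
        return 0.0
    p = alt_count / n_chrom
    return 2.0 * p * (1.0 - p) * n_chrom / (n_chrom - 1)


def windowed_pi(
    sites: Dict[int, Tuple[int, int]],
    start: int,
    end: int,
    window: int,
    step: int,
) -> List[Tuple[int, int, float]]:
    """Return (window_start, window_end, pi) for sliding windows over [start, end].

    `sites` maps a variant position to (alt_allele_count, n_chromosomes).
    Positions absent from `sites` are treated as invariant (pi = 0).
    """
    results = []
    w_start = start
    while w_start <= end:
        w_end = min(w_start + window - 1, end)
        total = sum(
            site_pi(alt, n)
            for pos, (alt, n) in sites.items()
            if w_start <= pos <= w_end
        )
        # Average over every position in the window, variant or not.
        results.append((w_start, w_end, total / (w_end - w_start + 1)))
        w_start += step
    return results
```

With `window=5000, step=1000` this mirrors the region-wide scan, and with `window=50, step=10` the per-gene scan.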

Bayesian phylogenetic analysis

We constructed phylogenetic trees using BEAST2 (Bouckaert et al., 2014; Bouckaert et al., 2019) with package SubstBMA (Wu et al., 2013). SubstBMA implements a spike-and-slab mixture model that simultaneously estimates the phylogenetic tree, the number of site partitions, the assignment of sites to partitions, the nucleotide substitution model, and a rate multiplier for each partition. Since we were chiefly interested in the partitions and their rate multipliers, we used the RDPM model as described by Wu et al., 2013. In the RDPM model, the number of nucleotide substitution model categories is fixed to 1, so that all sites, regardless of rate partition, share the same estimated nucleotide substitution model. This reduces the number of parameters to be estimated and ensures that only evolutionary rates vary across site partitions, reducing overall model complexity. We used an uncorrelated lognormal relaxed molecular clock because, in reality, evolutionary rates vary among branches (Bergeron et al., 2023).

Priors

For the Dirichlet process priors, we used the informative priors constructed by Wu et al., 2013 for their mammal dataset. This is appropriate because their dataset includes several of the same species and their mammals span approximately the same evolutionary time that we consider in our study. We also used their priors on tree height and base rate distribution, and a Yule process coalescent prior. We did not specify a calibration point (a time-based prior on a node) because we did not expect our sequences to group according to the species tree.

Running BEAST2

We aligned sequences across genes and species and ran BEAST2 on various subsets of the alignment. For the Class I gene groups (MHC-A group, MHC-B group, MHC-C group, MHC-E group, MHC-F group, and MHC-G group), we repeated the analysis for (1) exon 2 only (PBR), (2) exon 3 only (PBR), (3) exon 4 only (non-PBR), and (4) exons 1, 5, 6, 7, and 8 together (non-PBR; ‘other’ exons). For the Class IIA gene groups (MHC-DMA group, MHC-DOA group, MHC-DRA group, MHC-DPA group, and MHC-DQA group), we used (1) exon 2 only (PBR), (2) exon 3 only (non-PBR), and (3) exons 1, 3, 4, and 5 together (non-PBR; ‘other’ exons). For Class IIB gene groups (MHC-DMB group, MHC-DOB group, MHC-DRB group, MHC-DPB group, and MHC-DQB group), we analyzed (1) exon 2 only (PBR), (2) exon 3 only (non-PBR), and (3) exons 1, 3, 4, and 5 together (non-PBR; ‘other’ exons). In the following, each ‘analysis’ is a collection of BEAST2 runs using one of these sets of exons of a particular gene group.

The XML files we used to run BEAST2 were based closely on those used for the mammal dataset with the RDPM model and uncorrelated relaxed clock in Wu et al., 2013 (https://github.com/jessiewu/substBMA/blob/master/examples/mammal/mammal_rdpm_uc.xml; Vaughan et al., 2018). Running a model with per-site evolutionary rate categories and a relaxed clock means there are many parameters to estimate. Along with the large number of parameters, the highly polymorphic and often highly diverged sequences in our alignments make it difficult for BEAST2 to explore the state space. Thus, we undertook considerable effort to ensure good mixing and convergence of the chains. First, we employed coupled MCMC for all analyses. Coupled MCMC is essentially the same as the regular MCMC used in BEAST2, except that it uses additional ‘heated’ chains with increased acceptance probabilities that can traverse unfavorable intermediate states and allow the main chain to move away from an inferior local optimum (Müller and Bouckaert, 2020). Using coupled MCMC both speeds up BEAST2 runs and improves mixing and convergence. We used four heated chains for each run with a delta temperature of 0.025. Second, we ran each BEAST2 run for 40,000,000 states, discarding the first 4,000,000 states as burn-in and sampling every 10,000 states. Third, we ran at least 8 independent replicates of each analysis. The replicates use the exact same alignment, but explore state space independently and thus are useful for improving the effective sample size of tricky parameters. As recommended by BEAST2, we examined all replicates in Tracer version 1.7.2 (Rambaut et al., 2018) to ensure that they were sampling from the same parameter distributions and had reached convergence. We excluded replicates for which this was not true, as these chains were probably stuck in suboptimal state space. 
Additionally, Tracer provides estimates of the effective sample size (ESS) for the combined set of states from all chosen replicates, and we required that the ESS be larger than 100 for all parameters. If there were fewer than 4 acceptable replicates or if the ESS was below 100 for any parameter, we re-ran more independent replicates of the analysis until these requirements were satisfied. We obtained between 5 and 18 acceptable replicates per analysis (median 8).
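The run settings and acceptance criteria above can be summarized in a short Python sketch. The function names are hypothetical, and the ESS values would in practice be read off Tracer rather than computed here; whether the first and last states are logged can shift the sample count by one.

```python
from typing import Mapping


def retained_samples(chain_length: int, burn_in: int, sample_every: int) -> int:
    """Approximate number of posterior samples kept from one replicate."""
    return (chain_length - burn_in) // sample_every


def analysis_is_acceptable(
    n_acceptable_replicates: int,
    combined_ess: Mapping[str, float],
    min_replicates: int = 4,
    min_ess: float = 100.0,
) -> bool:
    """Apply the two stated requirements to one analysis.

    Requires at least `min_replicates` acceptable replicates and a combined
    ESS above `min_ess` for every parameter. Treating an ESS of exactly 100
    as failing is an assumption of this sketch.
    """
    enough_replicates = n_acceptable_replicates >= min_replicates
    ess_ok = all(ess > min_ess for ess in combined_ess.values())
    return enough_replicates and ess_ok
```

For a full-length run, `retained_samples(40_000_000, 4_000_000, 10_000)` gives 3,600 samples per replicate, so combining 8 replicates would yield roughly 28,800 trees.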

For some analyses, computational limitations prevented BEAST2 from being able to reach 40,000,000 states. In these situations, more replicates (of fewer states) were usually required to achieve good mixing and convergence. The first 4,000,000 states from each run were still discarded as burn-in even though this represented more than 10% of states in these cases.

This stringent procedure ensured that all of the replicates were exploring the same parameter space and were converging upon the same global optimum, allowing the ≥4 independent runs to be justifiably combined. We combined the acceptable replicates using LogCombiner version 2.6.7 (Drummond and Rambaut, 2007), which aggregates the results across all states. We then used the combined results to perform downstream analyses.

The XML files required to run BEAST2 are provided as Source code 1.

Phylogenetic trees

After combining acceptable replicates, we obtained 12,382–64,818 phylogenies per group/gene region (mean 34,499). These trees are provided in https://doi.org/10.5061/dryad.zcrjdfnrz. We used TreeAnnotator version 2.6.3 (Drummond and Rambaut, 2007) to summarize each set of possible trees as a maximum clade credibility tree, which is the tree that maximizes the product of posterior clade probabilities. Since BEAST2 samples trees from the posterior, one could in principle perform model testing directly from the posterior samples; the complete set of trees can typically be reduced to a smaller 95% credible set of trees representing the ‘true’ tree (BEA, 2024). However, given the high complexity of the model space, all our posterior trees were unique, meaning this was not possible in practice. (Since the prior over tree topologies is unstructured, this effectively puts minuscule prior weight on trees with monophyly. Thus, sampling directly from the posterior provides an unacceptably high-variance estimator.)
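As a rough illustration of the maximum clade credibility criterion, the sketch below scores each sampled tree by the sum of log posterior clade frequencies and returns the best-scoring one. It is not TreeAnnotator's implementation: trees are represented here simply as sets of clades (each clade a set of tip labels), which is an assumption made for this example, and node-height summarization is omitted.

```python
import math
from collections import Counter
from typing import FrozenSet, List

Clade = FrozenSet[str]
Tree = FrozenSet[Clade]


def maximum_clade_credibility(trees: List[Tree]) -> Tree:
    """Return the sampled tree maximizing the product of clade frequencies."""
    if not trees:
        raise ValueError("need at least one sampled tree")
    n = len(trees)
    # Posterior clade probability is estimated as the fraction of sampled
    # trees that contain the clade.
    clade_counts = Counter(clade for tree in trees for clade in tree)

    def log_score(tree: Tree) -> float:
        # Sum of logs equals the log of the product, and avoids underflow.
        return sum(math.log(clade_counts[clade] / n) for clade in tree)

    return max(trees, key=log_score)
```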

Gene conversion

We calculated gene conversion fragments using GENECONV version 1.81a (Sawyer, 1999) on each alignment. It is generally advisable to use only synonymous sites when running the program on a protein-coding alignment, since silent sites within the same codon position are likely to be correlated. However, the extreme polymorphism in these MHC genes meant there were too few silent sites to use in the analysis. Thus, we considered all sites, but caution that this could slightly overestimate the lengths of our inferred conversion tracts; we were mainly concerned with the presence of a conversion tract rather than its precise length. For each alignment, we ran GENECONV with options ListPairs, Allouter, Numsims = 10000, and Startseed = 310. We collected all inferred ‘Global Inner’ (GI) fragments with $\mathrm{sim\_pval} < 0.05$ (this is pre-corrected for multiple comparisons by the program). Each GI fragment represents a possible gene conversion event between two sequences in the alignment.

For each GI fragment, we made an educated guess on which sequence was the donor sequence and which was the acceptor sequence by comparing how sequences clustered using sites within the fragment bounds to how sequences clustered using sites outside of the fragment bounds (but within the same exon). Sequences that were determined to be acceptor sequences were excluded from the Bayes factor analyses for the relevant exon because their non-tree-like behavior has the potential to bias results. For sequence pairs where the direction could not be determined, both sequences were excluded from subsequent analyses.

Bayes factors

Because we could not perform model testing directly on the full phylogenies, we used an alternative approach: computing Bayes factors for TSP within manageable subsets of the data, i.e. quartets of alleles. Let $D$ be a sample of phylogenies from BEAST2, sampled from the posterior with uniform prior. For a chosen species (here, human), the null hypothesis, $H$, is that the human alleles form a monophyletic group, and the alternative hypothesis, $H^{c}$, is its complement: that the human alleles do not form a monophyletic group. The Bayes factor, $K$, is a ratio quantifying support for the alternative hypothesis:

$$K = \frac{\Pr(D \mid H^{c})}{\Pr(D \mid H)} = \frac{\Pr(H^{c} \mid D)}{\Pr(H \mid D)} \cdot \frac{\Pr(H)}{\Pr(H^{c})}$$

where the first term on the right-hand side is the posterior odds in favor of the alternative hypothesis and the second term is the prior odds in favor of the null hypothesis. Bayes factors above 100 are considered decisive support for the alternative hypothesis (Jeffreys, 1998).
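For readers who want the intermediate step, the second equality follows from applying Bayes' theorem to each hypothesis separately; the marginal likelihood $\Pr(D)$ appears in both numerator and denominator and cancels. A brief sketch of the algebra:

```latex
\begin{align*}
\Pr(D \mid H) &= \frac{\Pr(H \mid D)\,\Pr(D)}{\Pr(H)}, \qquad
\Pr(D \mid H^{c}) = \frac{\Pr(H^{c} \mid D)\,\Pr(D)}{\Pr(H^{c})}, \\[4pt]
K &= \frac{\Pr(D \mid H^{c})}{\Pr(D \mid H)}
   = \frac{\Pr(H^{c} \mid D)\,\Pr(D)}{\Pr(H^{c})}
     \cdot \frac{\Pr(H)}{\Pr(H \mid D)\,\Pr(D)}
   = \frac{\Pr(H^{c} \mid D)}{\Pr(H \mid D)} \cdot \frac{\Pr(H)}{\Pr(H^{c})}.
\end{align*}
```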

Because it is difficult to evaluate monophyly using a large number of alleles, we evaluate Bayes factors considering four alleles at a time: two alleles of a single species (here, human) and two alleles from other species. For example, to assess support for TSP between humans and chimpanzees, we could use two human alleles and two bonobo alleles. Or, to assess support for TSP between humans and OWM, we could use two human alleles, one baboon allele, and one macaque allele. Because there are many possible sets of four alleles for each comparison, we tested a large number of quartets. We reported the maximum Bayes factor among all tested allele sets to represent evidence for TSP for that species comparison, because our aim was to find any evidence of TSP among any set of four alleles.
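The quartet scan can be sketched in Python as follows. This is only an illustration under stated assumptions: the text does not specify how quartets were sampled or how many were tested, so the exhaustive enumeration, the function names, and the `bayes_factor` callback are all hypothetical.

```python
from itertools import combinations
from typing import Callable, List, Sequence, Tuple

Quartet = Tuple[str, str, str, str]


def enumerate_quartets(focal: Sequence[str], others: Sequence[str]) -> List[Quartet]:
    """Pair every two focal-species alleles with every two comparison alleles."""
    return [
        (f1, f2, o1, o2)
        for f1, f2 in combinations(focal, 2)
        for o1, o2 in combinations(others, 2)
    ]


def max_bayes_factor(
    quartets: Sequence[Quartet],
    bayes_factor: Callable[[Quartet], float],
) -> Tuple[float, Quartet]:
    """Return the largest Bayes factor and the quartet that produced it."""
    if not quartets:
        raise ValueError("no quartets to test")
    best = max(quartets, key=bayes_factor)
    return bayes_factor(best), best
```

Reporting the maximum suits the stated aim of finding any evidence of TSP, though it also means the reported value grows with the number of quartets tested.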

Next, we calculated the prior odds of the null hypothesis (that the alleles of the chosen species, i.e. humans, form a monophyletic group). The prior odds are $\frac{\Pr(H)}{\Pr(H^{c})} = \frac{1}{2}$ because, if the trees were assembled at random, there is one possible unrooted tree in which the two human alleles form a monophyletic group and two possible unrooted trees in which they do not, as shown in Figure 7.

Figure 7

Possible unrooted trees of 4 alleles.

There is one tree where the human alleles are monophyletic, and two trees where they are non-monophyletic.
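The count behind the prior odds can be checked by enumerating the three unrooted topologies on four tips, each of which is determined by how the tips split into two pairs. The following Python sketch does this; the function names and allele labels are placeholders chosen for the example.

```python
from fractions import Fraction
from typing import FrozenSet, List, Tuple

Split = Tuple[FrozenSet[str], FrozenSet[str]]


def quartet_topologies(a: str, b: str, c: str, d: str) -> List[Split]:
    """The three unrooted topologies on four tips, written as pair splits."""
    return [
        (frozenset({a, b}), frozenset({c, d})),
        (frozenset({a, c}), frozenset({b, d})),
        (frozenset({a, d}), frozenset({b, c})),
    ]


def prior_odds_monophyly(focal_pair: Tuple[str, str], topologies: List[Split]) -> Fraction:
    """Prior odds that the focal pair groups together, with equal weight per topology."""
    pair = frozenset(focal_pair)
    n_mono = sum(1 for split in topologies if pair in split)
    return Fraction(n_mono, len(topologies) - n_mono)
```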

The data, $D$, is the set of BEAST2 trees, so the posterior odds $\frac{\Pr(H^{c} \mid D)}{\Pr(H \mid D)}$ is the fraction of BEAST2 trees in which the two human alleles do not form a monophyletic group divided by the fraction in which they do. If either fraction is 0, we set its probability to $p = \frac{1}{n + 1}$, where $n$ is the number of BEAST2 trees for that gene/sequence subset, and set the complement’s probability to $1 - p$. This is why some labels in Figures 3 and 4 contain a > sign (e.g. if no trees in a set of 14,000 were monophyletic, then the Bayes factor must be at least 7,000).

Bayes factors $K$ were then computed as follows and interpreted according to the scale given by Jeffreys, 1998.

$$K = \frac{\Pr(H^{c} \mid D)}{\Pr(H \mid D)} \cdot \frac{1}{2}$$
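A minimal Python sketch of this calculation for one quartet is given below, including the zero-count adjustment described above. The function name and the returned lower-bound flag are assumptions made for illustration; the inputs are simply the number of sampled trees in which the two focal alleles are monophyletic and the total number of sampled trees.

```python
from typing import Tuple


def quartet_bayes_factor(
    n_monophyletic: int,
    n_trees: int,
    prior_odds: float = 0.5,
) -> Tuple[float, bool]:
    """Return (K, is_lower_bound) for one quartet.

    K is the posterior odds of non-monophyly times the prior odds of
    monophyly. `is_lower_bound` is True when no sampled tree was
    monophyletic, so K can only be reported as "greater than" this value.
    """
    if n_trees <= 0 or not 0 <= n_monophyletic <= n_trees:
        raise ValueError("counts must satisfy 0 <= n_monophyletic <= n_trees, n_trees > 0")

    p_mono = n_monophyletic / n_trees
    is_lower_bound = False
    if n_monophyletic == 0:
        # Zero fraction: replace with p = 1 / (n + 1).
        p_mono = 1.0 / (n_trees + 1)
        is_lower_bound = True
    elif n_monophyletic == n_trees:
        # The non-monophyletic fraction is zero, so it gets p = 1 / (n + 1).
        p_mono = 1.0 - 1.0 / (n_trees + 1)

    p_non_mono = 1.0 - p_mono
    return (p_non_mono / p_mono) * prior_odds, is_lower_bound
```

For the example in the text, `quartet_bayes_factor(0, 14000)` gives posterior odds of 14,000 and hence a Bayes factor of about 7,000, flagged as a lower bound.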

For each gene and genic region, we tested for TSP between human and chimpanzee, gorilla, orangutan, gibbon, OWM, and NWM. For the Class II genes, which have orthologs beyond the primates, we also tested for TSP between human and tarsier, Strepsirrhini, Flying Lemur, Tree Shrew, Glires, Laurasiatheria, and Atlantogenata. For the Class I genes, we considered the tarsier and Strepsirrhini to be outgroup species.

Rapidly-evolving sites

BEAST2 places sites into partitions and estimates evolutionary rates for each partition. We averaged these rates over all sampled states, resulting in an overall (relative) evolutionary rate for each nucleotide position. We then normalized these rates. In designing the gene groups, we included common ‘backbone sequences’ in every set, although we expanded one particular clade for each focused tree. The inclusion of backbone genes spanning the whole MHC family caused the alignments to contain many gaps. To normalize the rates, we took advantage of the fact that every alignment had many mostly-gap sites. We defined ‘gappy’ sites as those in which the alignment had a gap in all of the following human backbone alleles: HLA-A*01:01:01:01, HLA-B*07:02:01:01, HLA-C*01:02:01:01, HLA-E*01:01:01:01, HLA-F*01:01:01:01, HLA-G*01:01:01:01, and HLA-J*01:01:01:01 for Class I, HLA-DRA*01:01:01:01, HLA-DQA1*01:01:01:01, HLA-DPA1*01:03:01:01, HLA-DMA*01:01:01:01, and HLA-DOA*01:01:01:01 for Class IIA, and HLA-DRB1*01:01:01:01, HLA-DQB1*02:01:01:01, HLA-DPB1*01:01:01:01, HLA-DMB*01:01:01:01, and HLA-DOB*01:01:01:01 for Class IIB. Because mostly-gap sites are not expected to affect the BEAST2 run very much and were common to all focused gene group alignments, we considered these sites’ BEAST2 evolutionary rates as a baseline. Since the rates obtained from SubstBMA are relative anyway, we simply needed a set of sites that behaved similarly across each BEAST2 run so that we could normalize the rates in a consistent manner. For each group and gene region, we calculated the mean rate among all of these baseline gappy sites. Then, we expressed normalized per-site rates as fold changes by taking the base-2 logarithm of the ratio of each site’s evolutionary rate to the baseline mean.
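The normalization step can be written compactly. The Python sketch below assumes the per-site rates have already been averaged over sampled states and that a boolean mask marks the baseline ‘gappy’ sites; the function and argument names are placeholders.

```python
import math
from typing import List, Sequence


def normalized_log2_rates(rates: Sequence[float], is_gappy: Sequence[bool]) -> List[float]:
    """Express each site's rate as a log2 fold change over the gappy-site mean."""
    if len(rates) != len(is_gappy):
        raise ValueError("rates and is_gappy must have the same length")
    baseline = [rate for rate, gappy in zip(rates, is_gappy) if gappy]
    if not baseline:
        raise ValueError("no baseline (gappy) sites in this alignment")
    baseline_mean = sum(baseline) / len(baseline)
    return [math.log2(rate / baseline_mean) for rate in rates]
```

A value of 2 then means a site evolves four times faster than the baseline, and a value of -1 means half as fast, which makes rates comparable across separate BEAST2 runs.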

Protein structures

To map the rapidly-evolving sites onto protein structures, we first translated our nucleotide alignments into protein sequences using Expasy translate (Gasteiger et al., 2003). We then aligned our translated sequences with amino acid sequences from selected Protein Data Bank (PDB) (Berman et al., 2000) (https://www.rcsb.org/) structures (Table 5) using MUSCLE (Edgar, 2004) with default settings.

Table 5

Structures used to calculate distances to peptide.

This table lists the Protein Data Bank (Berman et al., 2000) structure codes and references for all structures used to calculate peptide distances.

Gene	Struct.	Reference
MHC-A	1ZVS	Chu et al., 2007
	3JTT	Dai et al., 2010
	3OX8	Liu et al., 2011
	3OXR	Liu et al., 2011
	3OXS	Liu et al., 2011
	3RL2	Zhang et al., 2011
	4HX1	Niu et al., 2013
	6J1V	Zhu et al., 2019
	6J1W	Zhu et al., 2019
	6MPP	Flores-Solis et al., 2019
	6PBH	van de Sandt et al., 2019
	7SR0	Finton et al., 2023
	7SRK	Finton et al., 2023
	7WT5	Asa et al., 2022
	8I5C	Lu et al., 2023
MHC-B	1JGD	Hillig et al., 2004
	3BVN	Kumar et al., 2009
	3KPL	Macdonald et al., 2009
	3KPN	Macdonald et al., 2009
	3LN4	Bade-Doding et al., 2011
	3LN5	Bade-Doding et al., 2011
	3RWJ	Wu et al., 2011
	3W39	Yagita et al., 2013
	3X13	Saunders et al., 2015
	4JQV	Rist et al., 2013
	4JRY	Liu et al., 2013
	4MJI	Motozono et al., 2014
	4O2E	Sun et al., 2014
	4PRA	Liu et al., 2014
	4PRB	Liu et al., 2014
	5EO0	Du et al., 2016
	5IEK	Alpizar et al., 2016
	5VUD	Illing et al., 2018
	5VVP	Illing et al., 2018
	5VWF	Illing et al., 2018
	6IWG	Yamamoto et al., 2019
	6MTM	Grant et al., 2018
	6PYL	Lim Kam Sian et al., 2019
	6PYV	Lim Kam Sian et al., 2019
	6UZP	Schutte et al., 2020
	6VIU	Schutte et al., 2020
	6Y27	Loll et al., 2020
	7R7V	Li et al., 2023
	7T0L	Vivian and Rossjohn, 2022
	7TUC	Jiang et al., 2022a
	7X1C	Huan et al., 2023
	7YG3	Jiang et al., 2022b
MHC-C	4NT6	Choo et al., 2014
	5VGD	Kaur et al., 2017
	5VGE	Kaur et al., 2017
	5W67	Mobbs et al., 2017
	6PAG	Moradi et al., 2021
	7WJ3	Asa et al., 2022
MHC-E	2ESV	Hoare et al., 2006
	3CDG	Petrie et al., 2008
	5W1V	Sullivan et al., 2017
	7P49	Walters et al., 2022
	7P4B	Walters et al., 2022
MHC-F	5IUE	Dulberger et al., 2017
MHC-G	1YDP	Clements et al., 2005
	2DYP	Shiroishi et al., 2006
	3KYN	Walpole et al., 2010
MHC-DM	1HDM	Mosyak et al., 1998
	2BC4	Nicholson et al., 2006
	4FQX	Pos et al., 2012
	4GBX	Pos et al., 2012
	4I0P	Guce et al., 2013
MHC-DO	4I0P	Guce et al., 2013
MHC-DP	3LQZ	Dai et al., 2010
	3WEX	Kusano et al., 2014
	7T2A	Ciacchi et al., 2023
	7T6I	Klobuch et al., 2022
	7ZAK	Racle et al., 2023
MHC-DQ	2NNA	Henderson et al., 2007
	4D8P	Tollefsen et al., 2012
	5KSA	Petersen et al., 2016
	5KSU	Nguyen et al., 2017
	6DIG	Jiang et al., 2019
	6PX6	Ting et al., 2020
MHC-DR	1BX2	Smith et al., 1998
	1FV1	Li et al., 2000
	1H15	Lang et al., 2002
	1T5X	Zavala-Ruiz et al., 2004
	2Q6W	Parry et al., 2007
	3C5J	Dai et al., 2008
	4FQX	Pos et al., 2012
	4H1L	Yin et al., 2012
	5JLZ	Gerstner et al., 2016
	5V4M	Ooi et al., 2017
	6ATF	Scally et al., 2017
	8EUQ	Kassardjian et al., 2023

The translated alignments and the PDB structures’ sequences allowed us to determine which nucleotides in the alignment corresponded to which amino acids in each structure. We then calculated per-amino-acid evolutionary rates by averaging the per-site evolutionary rates among the sites composing each codon. We caution that the vast majority of PDB structures we considered were from human (and were functional), and thus they cannot capture all of the indels and structural changes that define different primate proteins. Additionally, some of our alignments included pseudogenes and diverse sets of related genes, for which it may seem reductionist to map to a single human protein structure. However, we note that despite significant sequence variation, the structures of different MHC proteins (from different genes within the same class) are extremely similar, and thus we do not expect such complications to alter the overall conclusions.
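As an illustration of the codon-averaging step, the sketch below collapses per-nucleotide rates into per-amino-acid rates. It assumes, hypothetically, that codons are read in frame from the first non-gap column of a single aligned reference sequence; the function name `per_codon_rates` and the toy input are not from the original analysis.

```python
def per_codon_rates(reference_aligned, site_rates):
    """Average per-column rates over the three columns of each reference codon.

    reference_aligned -- aligned nucleotide string for one reference allele
    site_rates        -- one rate per alignment column
    """
    # Keep only columns where the reference has a nucleotide (skip gaps).
    columns = [i for i, base in enumerate(reference_aligned) if base != "-"]
    # Ignore a trailing partial codon, if any.
    usable = len(columns) - len(columns) % 3
    rates = []
    for start in range(0, usable, 3):
        codon_columns = columns[start:start + 3]
        rates.append(sum(site_rates[c] for c in codon_columns) / 3)
    return rates


# Toy example: two codons separated by a three-column gap in the reference.
toy_codon_rates = per_codon_rates("ATG---GCC", [1, 2, 3, 9, 9, 9, 4, 5, 6])
```

Columns that are gaps in the reference are skipped, so the rates at those columns do not contribute to any amino acid of that reference protein.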

We used PyMOL version 2.4.2 (Schrödinger, LLC, 2021) to visualize the per-amino-acid evolutionary rates on each gene’s protein structure. To prepare the main figures, we used the following models from the Protein Data Bank (Berman et al., 2000) (https://www.rcsb.org/): 4BCE (Teze et al., 2014) for HLA-B, 4NT6 (Choo et al., 2014) for HLA-C, 7P4B (Walters et al., 2022) for HLA-E, 5JLZ (Gerstner et al., 2016) for HLA-DR, and 2NNA (Henderson et al., 2007) for HLA-DQ.

We calculated the distances between all atoms of all amino acids of the HLA molecule and all atoms of all amino acids of the peptide in PyMOL, then took the minimum distance to represent each amino acid’s distance to the peptide. Where possible, we averaged the minimum distances over multiple alternative structures for each protein, to avoid relying too heavily on a particular structure. The structures chosen (Table 5) represented different alleles, different contexts (e.g. bound to a receptor or not, bound to a self or non-self peptide), and in a few cases different species.
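The distance calculation can be expressed independently of PyMOL. The following sketch assumes atomic coordinates have already been extracted into plain Python tuples; the function names and the toy coordinates are hypothetical, and the original analysis was carried out in PyMOL rather than with this code.

```python
import math


def min_distance_to_peptide(residue_atoms, peptide_atoms):
    """Map each residue to the smallest atom-to-atom distance to the peptide.

    residue_atoms -- dict: residue number -> list of (x, y, z) atom coordinates
    peptide_atoms -- list of (x, y, z) coordinates for all peptide atoms
    """
    return {
        residue: min(math.dist(a, p) for a in atoms for p in peptide_atoms)
        for residue, atoms in residue_atoms.items()
    }


def average_over_structures(per_structure):
    """Average each residue's minimum distance over the structures containing it."""
    totals, counts = {}, {}
    for distances in per_structure:
        for residue, d in distances.items():
            totals[residue] = totals.get(residue, 0.0) + d
            counts[residue] = counts.get(residue, 0) + 1
    return {residue: totals[residue] / counts[residue] for residue in totals}


# Toy coordinates (in arbitrary units), not taken from any PDB entry.
structure_a = min_distance_to_peptide(
    {1: [(0, 0, 0), (1, 0, 0)], 2: [(10, 0, 0)]},
    [(4, 0, 0), (0, 3, 0)],
)
averaged = average_over_structures([structure_a, {1: 5.0}])
```

Averaging only over the structures in which a residue appears is one possible design choice; it keeps residues that are unresolved in some structures from being dropped altogether.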

Disease and trait literature

We conducted a literature search for papers that used HLA fine-mapping to discover disease and trait associations, limiting our selection to studies that included at least 1000 cases and identified putatively independent signals via conditional analysis. We included all independent amino acid signals identified as significant by the original authors. If there was more than one study for the same disease/trait, but in different populations, we included all unique independent hits. We also collected associations between amino acids and TCR phenotypes.

For Figure 6—figure supplement 7, we counted the number of significant, unique trait associations for each amino acid and plotted them against each amino acid’s evolutionary rate. We drew simple linear regression lines to evaluate the association for each gene. We caution that this simple analysis does not take into account the significance level of each hit nor the ranking (e.g. top hit, second independent hit) of the associations, and that each study’s authors may have performed conditional analyses differently. References are listed in Table 1 and in Table 1—source data 1.
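A minimal sketch of this counting and regression step is given below. The data structures are assumptions made for illustration: each hit is represented as a hypothetical `(amino_acid_position, trait)` pair, and the line is fitted by ordinary least squares written out by hand; the positions, traits, and rates shown are invented.

```python
from collections import Counter


def count_unique_associations(hits):
    """Count distinct traits per amino acid position.

    hits -- iterable of (position, trait) pairs; repeated pairs count once
    """
    return Counter(position for position, _trait in set(hits))


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented hits: the duplicate (9, "trait A") pair is counted only once.
toy_hits = [(9, "trait A"), (9, "trait A"), (9, "trait B"), (57, "trait B")]
toy_counts = count_unique_associations(toy_hits)

# Invented evolutionary rates (x) and association counts (y).
toy_slope, toy_intercept = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```

As the text cautions, a count of this kind treats every hit equally, regardless of its significance level or its rank in a conditional analysis.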

Appendix 1

MHC Nomenclature

Appendix 1—figure 1

MHC allele nomenclature.

(A) Human HLA alleles are named in a standard fashion, with the gene name followed by four colon-separated fields. The first field indicates a broad-scale allele group which sometimes corresponds to a serological antigen. The second field denotes a specific HLA protein. The third field indicates synonymous changes to the nucleotide sequence in the coding region, while the fourth field is used to distinguish alleles with differences in the noncoding regions. If an allele’s expression has been characterized, an informative suffix is sometimes added (Robinson et al., 2024; Marsh et al., 2010). (B) Researchers have applied the same format to non-human alleles, with some key differences. Instead of ‘HLA’, a prefix which concatenates the first two letters of the genus name with the first two letters of the species name is used, except in certain cases where the species’ MHC system was named long ago. Paralogs can be distinguished using numbers, but sequences unassigned to a particular locus or paralog might incorporate a ‘W’ in the gene name. Use of expression tags varies, with some being added to the end of the gene name instead of the end of the entire allele name. Pseudogenes can be denoted with gene name suffixes, gene names themselves, expression suffixes, or not at all. For both human and non-human alleles, the lack of an expression suffix does not imply normal expression (de Groot et al., 2020). SLA: Swine Leukocyte Antigen; Chsa: *Chlorocebus sabaeus*—green monkey; Lero: *Leontopithecus rosalia*—golden lion tamarin; Mamu: *Macaca mulatta*—rhesus macaque; Aotr: *Aotus trivirgatus*—three-striped night monkey; Popy: *Pongo pygmaeus*—Bornean orangutan; Ceat: *Cercocebus atys*—sooty mangabey; Sala: *Saguinus labiatus*—white-lipped tamarin; Gogo: *Gorilla gorilla*—Western gorilla; Rano: *Rattus norvegicus*—brown rat; Patr: *Pan troglodytes*—chimpanzee; Papa: *Pan paniscus*—bonobo.

The large number of genes, some with thousands of alleles, necessitates a consistent naming scheme (Appendix 1—figure 1). Known alleles are given names such as ‘Aole-DQB1*23:01’, and names are maintained and updated by the WHO Nomenclature Committee for Factors of the HLA System (Robinson et al., 2024; Marsh et al., 2010). First, the species of origin is indicated by a four-letter prefix consisting of the first two letters of the genus name and the first two letters of the species name, for example ‘Chsa-’ for Chlorocebus sabaeus, the green monkey. There are some exceptions, usually because these MHC systems were first investigated before the naming scheme was put into place. These include ‘HLA-’ for human, ‘H2-’ for mouse, ‘RT1-’ for rat, and ‘SLA-’ for swine, among others (de Groot et al., 2020; de Groot et al., 2012).

After the hyphen is the locus designation. Some species, such as humans, have a relatively simple landscape of MHC genes, making it easy to identify sequences that belong to a particular gene. However, other species have recent gene expansions and considerable region conformation diversity, making it difficult to assign alleles to genes. In some cases, these are given generic locus designations; for example, rhesus macaques have at least 19 paralogous B loci, but most are given the ambiguous name ‘Mamu-B’, with the exception of a few well-characterized genes such as Mamu-B17. In other cases, unassigned sequences are given a working designation indicated by a ‘W’, such as ‘Popy-DRB*W113:01’. In this example, the allele definitely belongs to a DRB paralog, but it is unclear which one. Some locus names are given a ‘Ps’ suffix to indicate they are pseudogenes, such as ‘Caja-G5Ps’. However, not all pseudogenes are labeled this way, so one should not assume the lack of a ‘Ps’ suffix means a gene is functional (de Groot et al., 2020; de Groot et al., 2012).

After the species and locus name, each MHC allele is designated by up to four fields separated by colons. The first field designates the type or family. Types often, but not always, correspond to the broad serological reactivity of the allele, as many were named before full sequences were known. To facilitate comparison across closely related species, researchers generally try to give related MHC alleles the same first-field designation, for example Gogo-A*02 and HLA-A*02. However, certain genes do not follow this general rule. For example, MHC-DPB1 has undergone considerable gene conversion, resulting in no distinct types; thus, a shared first-field designation between species is meaningless for this gene (de Groot et al., 2012; de Groot et al., 2020). The second field designates the allele subtype, or unique amino acid sequence. For example, ‘Patr-A*08:01’ and ‘Patr-A*08:02’ are part of the same allelic family, but have some nonsynonymous differences. Synonymous changes are specified by the third field. For example, ‘Paan-DPB1*03:01:01’ and ‘Paan-DPB1*03:01:02’ have silent substitutions which ultimately result in the same protein. Lastly, the fourth field is used to describe changes to the noncoding regions—that is, the 5’ and 3’ UTRs and the introns. Of course, this requires that these regions have been sequenced, so not all alleles will have a fourth field. Finally, alleles can also be followed by an optional suffix to describe expression changes, most commonly ‘N’ for a null/nonexpressed allele or ‘L’ for a lowly-expressed allele (Hurley, 2021; Douillard et al., 2021).
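To make the naming scheme concrete, the sketch below splits an allele name into its parts. It is a simplified illustration and not an official parser: the function name `parse_allele`, the returned keys, the rule that any trailing uppercase letter on the last field is an expression suffix, and the rule that a leading 'W' in the first field marks a working designation are all assumptions, and real database names include exceptions that this code does not handle.

```python
import re


def parse_allele(name):
    """Split an allele name such as 'HLA-A*01:01:01:02N' into its components."""
    species, dash, rest = name.partition("-")
    locus, star, field_text = rest.partition("*")
    if not dash or not star or not field_text:
        raise ValueError(f"not in prefix-locus*fields form: {name!r}")
    fields = field_text.split(":")
    if len(fields) > 4:
        raise ValueError(f"more than four fields: {name!r}")
    # Assumed rule: a single uppercase letter after the last digit is a suffix.
    suffix = ""
    match = re.fullmatch(r"(.*\d)([A-Z]?)", fields[-1])
    if match:
        fields[-1], suffix = match.group(1), match.group(2)
    return {
        "species": species,
        "locus": locus,
        "fields": fields,
        "suffix": suffix,
        # Assumed rule: working designations start the first field with 'W'.
        "working": fields[0].startswith("W"),
    }


null_allele = parse_allele("HLA-A*01:01:01:02N")
working_allele = parse_allele("Popy-DRB*W113:01")
```

Because not every allele is resolved to four fields, the `fields` list can have between one and four entries, and code that consumes it should not assume a fixed length.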

In general, caution must be taken in interpreting allele names. First, because not all alleles are resolved at three- and four-field resolution, the names are not all strictly hierarchical; alleles which have all four fields cannot always simply be truncated to obtain the two-field version. Second, because human alleles were named in order of discovery, alleles with very different one- or two-field designations could ultimately have the same nucleotide or amino acid sequence in the peptide-binding groove. When discussing functional consequences, it is more informative to group alleles by their nucleotide or amino acid sequence in the peptide-binding region (PBR), designated G- and P-groups, respectively, than by their one- or two-field names (Hurley, 2021; Douillard et al., 2021). Additionally, the suffixes can be misleading because not every allele has had its expression level characterized; the absence of an ‘L’ does not mean that an allele has normal expression (Hurley, 2021). Despite these small issues, the naming system is generally intuitive and very useful for understanding alleles at a glance. In this work, alleles obtained from the IPD-MHC and IPD-IMGT/HLA databases are named this way, but sequences obtained from RefSeq are labeled by accession number or location in a genome.

Data availability

The current manuscript is a computational study, and all data used are publicly available. Supplementary file 1 and Source code 1 contain lists of alleles used in this study and xml files for running BEAST2, respectively. Sets of posterior trees from BEAST2 for each gene group and gene region are available at https://doi.org/10.5061/dryad.zcrjdfnrz.

The following data sets were generated

1. Fortier AL
2. Pritchard JK
(2025) Dryad Digital Repository
The primate Major Histocompatibility Complex: Sets of posterior trees from BEAST2 for each gene group and region.

https://doi.org/10.5061/dryad.zcrjdfnrz

References

1. Abi-Rached L
2. Kuhl H
3. Roos C
4. ten Hallers B
5. Zhu B
6. Carbone L
7. de Jong PJ
8. Mootnick AR
9. Knaust F
10. Reinhardt R
11. Parham P
12. Walter L
(2010) A small, variable, and irregular killer cell Ig-like receptor locus accompanies the absence of MHC-C and MHC-G in gibbons
Journal of Immunology 184:1379–1391.

https://doi.org/10.4049/jimmunol.0903016
1. Adams EJ
2. Parham P
(2001) Species‐specific evolution of MHC class I genes in the higher primates
Immunological Reviews 183:41–64.

https://doi.org/10.1034/j.1600-065x.2001.1830104.x
1. Adams EJ
2. Luoma AM
(2013) The adaptable major histocompatibility complex (MHC) fold: structure and function of nonclassical and MHC class I-like molecules
Annual Review of Immunology 31:529–561.

https://doi.org/10.1146/annurev-immunol-032712-095912
Preprint
(2022) Population genomics of stone age eurasia
bioRxiv.

https://doi.org/10.1101/2022.05.04.490594
Data
(authors) (2016) Structure of HLA-B*40:02 in complex with the endogenous peptide REFSKEPEL
Worldwide Protein Data Bank.

https://doi.org/10.2210/pdb5IEK/pdb
1. Arden B
2. Klein J
(1982) Biochemical comparison of major histocompatibility complex molecules from different subspecies of Mus musculus: evidence for trans-specific evolution of alleles
PNAS 79:2342–2346.

https://doi.org/10.1073/pnas.79.7.2342
1. Asa M
2. Morita D
3. Kuroha J
4. Mizutani T
5. Mori N
6. Mikami B
7. Sugita M
(2022) Crystal structures of N-myristoylated lipopeptide-bound HLA class I complexes indicate reorganization of B-pocket architecture upon ligand binding
The Journal of Biological Chemistry 298:102100.

https://doi.org/10.1016/j.jbc.2022.102100
(2015) A global reference for human genetic variation
Nature 526:68–74.

https://doi.org/10.1038/nature15393
(2015) Trans-species polymorphism in humans and the great apes is generally maintained by balancing selection that modulates the host immune response
Human Genomics 9:21.

https://doi.org/10.1186/s40246-015-0043-1
(2011) The impact of human leukocyte antigen (HLA) micropolymorphism on ligand specificity within the HLA-B*41 allotypic family
Haematologica 96:110–118.

https://doi.org/10.3324/haematol.2010.030924
1. Barker DJ
2. Maccari G
3. Georgiou X
4. Cooper MA
5. Flicek P
6. Robinson J
7. Marsh SGE
(2023) The IPD-IMGT/HLA Database
Nucleic Acids Research 51:D1053–D1060.

https://doi.org/10.1093/nar/gkac1011
Software
1. BEA
(2024) Summarizing posterior trees, version v.2.7.8
Centre for Computational Evolution.

https://www.beast2.org/summarizing-posterior-trees/
1. Bergeron LA
2. Besenbacher S
3. Zheng J
4. Li P
5. Bertelsen MF
6. Quintard B
7. Hoffman JI
8. Li Z
9. St Leger J
10. Shao C
11. Stiller J
12. Gilbert MTP
13. Schierup MH
14. Zhang G
(2023) Evolution of the germline mutation rate across vertebrates
Nature 615:285–291.

https://doi.org/10.1038/s41586-023-05752-y
1. Berman HM
2. Westbrook J
3. Feng Z
4. Gilliland G
5. Bhat TN
6. Weissig H
7. Shindyalov IN
8. Bourne PE
(2000) The protein data bank
Nucleic Acids Research 28:235–242.

https://doi.org/10.1093/nar/28.1.235
1. Bhatia G
2. Patterson N
3. Pasaniuc B
4. Zaitlen N
5. Genovese G
6. Pollack S
7. Mallick S
8. Myers S
9. Tandon A
10. Spencer C
11. Palmer CD
12. Adeyemo AA
13. Akylbekova EL
14. Cupples LA
15. Divers J
16. Fornage M
17. Kao WHL
18. Lange L
19. Li M
20. Musani S
21. Mychaleckyj JC
22. Ogunniyi A
23. Papanicolaou G
24. Rotimi CN
25. Rotter JI
26. Ruczinski I
27. Salako B
28. Siscovick DS
29. Tayo BO
30. Yang Q
31. McCarroll S
32. Sabeti P
33. Lettre G
34. De Jager P
35. Hirschhorn J
36. Zhu X
37. Cooper R
38. Reich D
39. Wilson JG
40. Price AL
(2011) Genome-wide comparison of African-ancestry populations from CARe and other cohorts reveals signals of natural selection
American Journal of Human Genetics 89:368–381.

https://doi.org/10.1016/j.ajhg.2011.07.025
1. Bouckaert R
2. Heled J
3. Kühnert D
4. Vaughan T
5. Wu C-H
6. Xie D
7. Suchard MA
8. Rambaut A
9. Drummond AJ
(2014) BEAST 2: a software platform for Bayesian evolutionary analysis
PLOS Computational Biology 10:e1003537.

https://doi.org/10.1371/journal.pcbi.1003537
1. Bouckaert R
2. Vaughan TG
3. Barido-Sottani J
4. Duchêne S
5. Fourment M
6. Gavryushkina A
7. Heled J
8. Jones G
9. Kühnert D
10. De Maio N
11. Matschiner M
12. Mendes FK
13. Müller NF
14. Ogilvie HA
15. du Plessis L
16. Popinga A
17. Rambaut A
18. Rasmussen D
19. Siveroni I
20. Suchard MA
21. Wu C-H
22. Xie D
23. Zhang C
24. Stadler T
25. Drummond AJ
(2019) BEAST 2.5: An advanced software platform for Bayesian evolutionary analysis
PLOS Computational Biology 15:e1006650.

https://doi.org/10.1371/journal.pcbi.1006650
(1996) The MHC class I genes of the rhesus monkey: Different evolutionary histories of MHC class I and II genes in primates
The Journal of Immunology 156:4656–4665.

https://doi.org/10.4049/jimmunol.156.12.4656
1. Brändle U
2. Ono H
3. Vincek V
4. Klein D
5. Golubic M
6. Grahovac B
7. Klein J
(1992) Trans-species evolution of Mhc-DRB haplotype polymorphism in primates: organization of DRB genes in the chimpanzee
Immunogenetics 36:39–48.

https://doi.org/10.1007/BF00209291
1. Brandt DYC
2. César J
3. Goudet J
4. Meyer D
(2018) The effect of balancing selection on population differentiation: a study with HLA Genes
G3: Genes, Genomes, Genetics 8:2805–2815.

https://doi.org/10.1534/g3.118.200367
1. Bruijnesteijn J
(2023) HLA/MHC and KIR characterization in humans and non-human primates using Oxford nanopore technologies and pacific biosciences sequencing platforms
HLA 101:205–221.

https://doi.org/10.1111/tan.14957
1. Buniello A
2. MacArthur JAL
3. Cerezo M
4. Harris LW
5. Hayhurst J
6. Malangone C
7. McMahon A
8. Morales J
9. Mountjoy E
10. Sollis E
11. Suveges D
12. Vrousgou O
13. Whetzel PL
14. Amode R
15. Guillen JA
16. Riat HS
17. Trevanion SJ
18. Hall P
19. Junkins H
20. Flicek P
21. Burdett T
22. Hindorff LA
23. Cunningham F
24. Parkinson H
(2019) The NHGRI-EBI GWAS Catalog of published genome-wide association studies, targeted arrays and summary statistics 2019
Nucleic Acids Research 47:D1005–D1012.

https://doi.org/10.1093/nar/gky1120
1. Butler-Laporte G
2. Farjoun J
3. Nakanishi T
4. Lu T
5. Abner E
6. Chen Y
7. Hultström M
8. Metspalu A
9. Milani L
10. Mägi R
11. Nelis M
12. Hudjashov G
13. Yoshiji S
14. Ilboudo Y
15. Liang KYH
16. Su C-Y
17. Willet JDS
18. Esko T
19. Zhou S
20. Forgetta V
21. Taliun D
22. Richards JB
23. Estonian Biobank Research Team
(2023) HLA allele-calling using multi-ancestry whole-exome sequencing from the UK Biobank identifies 129 novel associations in 11 autoimmune diseases
Communications Biology 6:1–17.

https://doi.org/10.1038/s42003-023-05496-5
1. Cagliani R
2. Fumagalli M
3. Biasin M
4. Piacentini L
5. Riva S
6. Pozzoli U
7. Bonaglia MC
8. Bresolin N
9. Clerici M
10. Sironi M
(2010) Long-term balancing selection maintains trans-specific polymorphisms in the human TRIM5 gene
Human Genetics 128:577–588.

https://doi.org/10.1007/s00439-010-0884-6
1. Cagliani R
2. Guerini FR
3. Fumagalli M
4. Riva S
5. Agliardi C
6. Galimberti D
7. Pozzoli U
8. Goris A
9. Dubois B
10. Fenoglio C
11. Forni D
12. Sanna S
13. Zara I
14. Pitzalis M
15. Zoledziewska M
16. Cucca F
17. Marini F
18. Comi GP
19. Scarpini E
20. Bresolin N
21. Clerici M
22. Sironi M
(2012) A trans-specific polymorphism in ZC3HAV1 is maintained by long-standing balancing selection and may confer susceptibility to multiple sclerosis
Molecular Biology and Evolution 29:1599–1613.

https://doi.org/10.1093/molbev/mss002
1. Cheng Y
2. Grueber C
3. Hogg CJ
4. Belov K
(2022) Improved high-throughput MHC typing for non-model species using long-read sequencing
Molecular Ecology Resources 22:862–876.

https://doi.org/10.1111/1755-0998.13511
1. Choo JAL
2. Liu J
3. Toh X
4. Grotenbreg GM
5. Ren EC
(2014) The immunodominant influenza A virus M1(58-66) cytotoxic T lymphocyte epitope exhibits degenerate class I major histocompatibility complex restriction in humans
Journal of Virology 88:10613–10623.

https://doi.org/10.1128/JVI.00855-14
1. Chu F
2. Lou Z
3. Chen YW
4. Liu Y
5. Gao B
6. Zong L
7. Khan AH
8. Bell JI
9. Rao Z
10. Gao GF
(2007) First glimpse of the peptide presentation by rhesus macaque MHC class I: crystal structures of Mamu-A*01 complexed with two immunogenic SIV epitopes and insights into CTL escape
Journal of Immunology 178:944–952.

https://doi.org/10.4049/jimmunol.178.2.944
1. Ciacchi L
2. van de Garde MDB
3. Ladell K
4. Farenc C
5. Poelen MCM
6. Miners KL
7. Llerena C
8. Reid HH
9. Petersen J
10. Price DA
11. Rossjohn J
12. van Els C
(2023) CD4⁺ T cell-mediated recognition of a conserved cholesterol-dependent cytolysin epitope generates broad antibacterial immunity
Immunity 56:1082–1097.

https://doi.org/10.1016/j.immuni.2023.03.020
1. Clements CS
2. Kjer-Nielsen L
3. Kostenko L
4. Hoare HL
5. Dunstone MA
6. Moses E
7. Freed K
8. Brooks AG
9. Rossjohn J
10. McCluskey J
(2005) Crystal structure of HLA-G: a nonclassical MHC class I molecule expressed at the fetal-maternal interface
PNAS 102:3360–3365.

https://doi.org/10.1073/pnas.0409676102
1. Cong P-K
2. Bai W-Y
3. Li J-C
4. Yang M-Y
5. Khederzadeh S
6. Gai S-R
7. Li N
8. Liu Y-H
9. Yu S-H
10. Zhao W-W
11. Liu J-Q
12. Sun Y
13. Zhu X-W
14. Zhao P-P
15. Xia J-W
16. Guan P-L
17. Qian Y
18. Tao J-G
19. Xu L
20. Tian G
21. Wang P-Y
22. Xie S-Y
23. Qiu M-C
24. Liu K-Q
25. Tang B-S
26. Zheng H-F
(2022) Genomic analyses of 10,376 individuals in the Westlake BioBank for Chinese (WBBC) pilot project
Nature Communications 13:2939.

https://doi.org/10.1038/s41467-022-30526-x
1. Dai S
2. Crawford F
3. Marrack P
4. Kappler JW
(2008) The structure of HLA-DR52c: comparison to other HLA-DRB3 alleles
PNAS 105:11893–11897.

https://doi.org/10.1073/pnas.0805810105
1. Dai S
2. Murphy GA
3. Crawford F
4. Mack DG
5. Falta MT
6. Marrack P
7. Kappler JW
8. Fontenot AP
(2010) Crystal structure of HLA-DP2 and implications for chronic beryllium disease
PNAS 107:7425–7430.

https://doi.org/10.1073/pnas.1001772107
(2011) The variant call format and VCFtools
Bioinformatics 27:2156–2158.

https://doi.org/10.1093/bioinformatics/btr330
1. Darlay R
2. Ayers KL
3. Mells GF
4. Hall LS
5. Liu JZ
6. Almarri MA
7. Alexander GJ
8. Jones DE
9. Sandford RN
10. Anderson CA
11. Cordell HJ
(2018) Amino acid residues in five separate HLA genes can explain most of the known associations between the MHC and primary biliary cholangitis
PLOS Genetics 14:e1007833.

https://doi.org/10.1371/journal.pgen.1007833
1. de Groot NG
2. Otting N
3. Robinson J
4. Blancher A
5. Lafont BAP
6. Marsh SGE
7. O’Connor DH
8. Shiina T
9. Walter L
10. Watkins DI
11. Bontrop RE
(2012) Nomenclature report on the major histocompatibility complex genes and alleles of Great Ape, Old and New World monkey species
Immunogenetics 64:615–631.

https://doi.org/10.1007/s00251-012-0617-1
1. de Groot NG
2. Otting N
3. Maccari G
4. Robinson J
5. Hammond JA
6. Blancher A
7. Lafont BAP
8. Guethlein LA
9. Wroblewski EE
10. Marsh SGE
11. Shiina T
12. Walter L
13. Vigilant L
14. Parham P
15. O’Connor DH
16. Bontrop RE
(2020) Nomenclature report 2019: major histocompatibility complex genes and alleles of great and small Ape and old and new world monkey species
Immunogenetics 72:25–36.

https://doi.org/10.1007/s00251-019-01132-x
1. Dijkstra JM
2. Yamaguchi T
(2019) Ancient features of the MHC class II presentation pathway, and a model for the possible origin of MHC molecules
Immunogenetics 71:233–249.

https://doi.org/10.1007/s00251-018-1090-2
1. Dilthey AT
(2021) State-of-the-art genome inference in the human MHC
The International Journal of Biochemistry & Cell Biology 131:105882.

https://doi.org/10.1016/j.biocel.2020.105882
1. Douillard V
2. Castelli EC
3. Mack SJ
4. Hollenbach JA
5. Gourraud P-A
6. Vince N
7. Limou S
(2021) Approaching genetics through the MHC Lens: Tools and methods for HLA research
Frontiers in Genetics 12:774916.

https://doi.org/10.3389/fgene.2021.774916
1. Drummond AJ
2. Rambaut A
(2007) BEAST: Bayesian evolutionary analysis by sampling trees
BMC Evolutionary Biology 7:214.

https://doi.org/10.1186/1471-2148-7-214
1. Du VY
2. Bansal A
3. Carlson J
4. Salazar-Gonzalez JF
5. Salazar MG
6. Ladell K
7. Gras S
8. Josephs TM
9. Heath SL
10. Price DA
11. Rossjohn J
12. Hunter E
13. Goepfert PA
(2016) HIV-1-Specific CD8 T cells exhibit limited cross-reactivity during acute infection
Journal of Immunology 196:3276–3286.

https://doi.org/10.4049/jimmunol.1502411
1. Dulberger CL
2. McMurtrey CP
3. Hölzemer A
4. Neu KE
5. Liu V
6. Steinbach AM
7. Garcia-Beltran WF
8. Sulak M
9. Jabri B
10. Lynch VJ
11. Altfeld M
12. Hildebrand WH
13. Adams EJ
(2017) Human leukocyte antigen F presents peptides and regulates immunity through interactions with NK cell receptors
Immunity 46:1018–1029.

https://doi.org/10.1016/j.immuni.2017.06.002
1. Edgar RC
(2004) MUSCLE: multiple sequence alignment with high accuracy and high throughput
Nucleic Acids Research 32:1792–1797.

https://doi.org/10.1093/nar/gkh340
1. Ferguson W
2. Dvora S
3. Fikes RW
4. Stone AC
5. Boissinot S
(2012) Long-term balancing selection at the antiviral gene OAS1 in Central African chimpanzees
Molecular Biology and Evolution 29:1093–1103.

https://doi.org/10.1093/molbev/msr247
1. Ferreiro-Iglesias A
2. Lesseur C
3. McKay J
4. Hung RJ
5. Han Y
6. Zong X
7. Christiani D
8. Johansson M
9. Xiao X
10. Li Y
11. Qian DC
12. Ji X
13. Liu G
14. Caporaso N
15. Scelo G
16. Zaridze D
17. Mukeriya A
18. Kontic M
19. Ognjanovic S
20. Lissowska J
21. Szołkowska M
22. Swiatkowska B
23. Janout V
24. Holcatova I
25. Bolca C
26. Savic M
27. Ognjanovic M
28. Bojesen SE
29. Wu X
30. Albanes D
31. Aldrich MC
32. Tardon A
33. Fernandez-Somoano A
34. Fernandez-Tardon G
35. Le Marchand L
36. Rennert G
37. Chen C
38. Doherty J
39. Goodman G
40. Bickeböller H
41. Wichmann H-E
42. Risch A
43. Rosenberger A
44. Shen H
45. Dai J
46. Field JK
47. Davies M
48. Woll P
49. Teare MD
50. Kiemeney LA
51. van der Heijden EHFM
52. Yuan J-M
53. Hong Y-C
54. Haugen A
55. Zienolddiny S
56. Lam S
57. Tsao M-S
58. Johansson M
59. Grankvist K
60. Schabath MB
61. Andrew A
62. Duell E
63. Melander O
64. Brunnström H
65. Lazarus P
66. Arnold S
67. Slone S
68. Byun J
69. Kamal A
70. Zhu D
71. Landi MT
72. Amos CI
73. Brennan P
(2018) Fine mapping of MHC region in lung cancer highlights independent susceptibility loci by ethnicity
Nature Communications 9:3927.

https://doi.org/10.1038/s41467-018-05890-2
1. Field Y
2. Boyle EA
3. Telis N
4. Gao Z
5. Gaulton KJ
6. Golan D
7. Yengo L
8. Rocheleau G
9. Froguel P
10. McCarthy MI
11. Pritchard JK
(2016) Detection of human adaptation during the past 2000 years
Science 354:760–764.

https://doi.org/10.1126/science.aag0776
(1988) MHC polymorphism pre-dating speciation
Nature 335:265–267.

https://doi.org/10.1038/335265a0
1. Finton KAK
2. Rupert PB
3. Friend DJ
4. Dinca A
5. Lovelace ES
6. Buerger M
7. Rusnac DV
8. Foote-McNabb U
9. Chour W
10. Heath JR
11. Campbell JS
12. Pierce RH
13. Strong RK
(2023) Effects of HLA single chain trimer design on peptide presentation and stability
Frontiers in Immunology 14:1170462.

https://doi.org/10.3389/fimmu.2023.1170462
Data
(authors) (2019) HLA-A*01:01 complex with NRAS Q61K peptide by NMR
Worldwide Protein Data Bank.

https://doi.org/10.2210/pdb6MPP/pdb
(2023) A genomic timescale for placental mammal evolution
Science 380:eabl8189.

https://doi.org/10.1126/science.abl8189
1. Fortier AL
2. Pritchard JK
(2025) The primate major histocompatibility complex: an illustrative example of gene family evolution
eLife 14:RP103545.

https://doi.org/10.7554/eLife.103545
(2016) Non-self- and self-recognition models in plant self-incompatibility
Nature Plants 2:16130.

https://doi.org/10.1038/nplants.2016.130
1. Fukami-Kobayashi K
2. Shiina T
3. Anzai T
4. Sano K
5. Yamazaki M
6. Inoko H
7. Tateno Y
(2005) Genomic evolution of MHC class I region in primates
PNAS 102:9230–9234.

https://doi.org/10.1073/pnas.0500770102
1. Fuselli S
2. Baptista RP
3. Panziera A
4. Magi A
5. Guglielmi S
6. Tonin R
7. Benazzo A
8. Bauzer LG
9. Mazzoni CJ
10. Bertorelle G
(2018) A new hybrid approach for MHC genotyping: high-throughput NGS and long read MinION nanopore sequencing, with application to the non-model vertebrate Alpine chamois (Rupicapra rupicapra)
Heredity 121:293–303.

https://doi.org/10.1038/s41437-018-0070-5
1. Gasteiger E
2. Gattiker A
3. Hoogland C
4. Ivanyi I
5. Appel RD
6. Bairoch A
(2003) ExPASy: The proteomics server for in-depth protein knowledge and analysis
Nucleic Acids Research 31:3784–3788.

https://doi.org/10.1093/nar/gkg563
(1993) Evolutionary conservation of major histocompatibility complex-DR/peptide/T cell interactions in primates
The Journal of Experimental Medicine 177:979–987.

https://doi.org/10.1084/jem.177.4.979
Website
1. Genome Reference Consortium
(2022) Human Genome Region MHC
Accessed August 12, 2025.

https://www.ncbi.nlm.nih.gov/grc/human/regions/MHC
1. Gerstner C
2. Dubnovitsky A
3. Sandin C
4. Kozhukh G
5. Uchtenhagen H
6. James EA
7. Rönnelid J
8. Ytterberg AJ
9. Pieper J
10. Reed E
11. Tandre K
12. Rieck M
13. Zubarev RA
14. Rönnblom L
15. Sandalova T
16. Buckner JH
17. Achour A
18. Malmström V
(2016) Functional and structural characterization of a novel HLA-DRB1*04:01-Restricted α-Enolase T cell epitope in rheumatoid arthritis
Frontiers in Immunology 7:494.

https://doi.org/10.3389/fimmu.2016.00494
1. Gleimer M
2. Wahl AR
3. Hickman HD
4. Abi-Rached L
5. Norman PJ
6. Guethlein LA
7. Hammond JA
8. Draghi M
9. Adams EJ
10. Juo S
11. Jalili R
12. Gharizadeh B
13. Ronaghi M
14. Garcia KC
15. Hildebrand WH
16. Parham P
(2011) Although divergent in residues of the peptide binding site, conserved chimpanzee Patr-AL and polymorphic human HLA-A*02 have overlapping peptide-binding repertoires
Journal of Immunology 186:1575–1588.

https://doi.org/10.4049/jimmunol.1002990
1. Grant EJ
2. Josephs TM
3. Loh L
4. Clemens EB
5. Sant S
6. Bharadwaj M
7. Chen W
8. Rossjohn J
9. Gras S
10. Kedzierska K
(2018) Broad CD8⁺ T cell cross-recognition of distinct influenza A strains in humans
Nature Communications 9:5427.

https://doi.org/10.1038/s41467-018-07815-5
(2020) The unconventional role of HLA-E: The road less traveled
Molecular Immunology 120:101–112.

https://doi.org/10.1016/j.molimm.2020.02.011
1. Grimholt U
2. Tsukamoto K
3. Azuma T
4. Leong J
5. Koop BF
6. Dijkstra JM
(2015) A comprehensive analysis of teleost MHC class I sequences
BMC Evolutionary Biology 15:32.

https://doi.org/10.1186/s12862-015-0309-1
1. Guce AI
2. Mortimer SE
3. Yoon T
4. Painter CA
5. Jiang W
6. Mellins ED
7. Stern LJ
(2013) HLA-DO acts as a substrate mimic to inhibit HLA-DM by a competitive mechanism
Nature Structural & Molecular Biology 20:90–98.

https://doi.org/10.1038/nsmb.2460
(2015) Co-evolution of MHC class I and variable NK cell receptors in placental mammals
Immunological Reviews 267:259–282.

https://doi.org/10.1111/imr.12326
(1990) Allelic diversification at the class II DQB locus of the mammalian major histocompatibility complex
PNAS 87:1835–1839.

https://doi.org/10.1073/pnas.87.5.1835
(2017) Gorilla MHC class I gene and sequence variation in a comparative context
Immunogenetics 69:303–323.

https://doi.org/10.1007/s00251-017-0974-x
(2020) Comparative genetics of the major histocompatibility complex in humans and nonhuman primates
International Journal of Immunogenetics 47:243–260.

https://doi.org/10.1111/iji.12490
1. Henderson KN
2. Tye-Din JA
3. Reid HH
4. Chen Z
5. Borg NA
6. Beissbarth T
7. Tatham A
8. Mannering SI
9. Purcell AW
10. Dudek NL
11. van Heel DA
12. McCluskey J
13. Rossjohn J
14. Anderson RP
(2007) A structural and immunological basis for the role of human leukocyte antigen DQ8 in celiac disease
Immunity 27:23–34.

https://doi.org/10.1016/j.immuni.2007.05.015
1. Hillig RC
2. Hülsmeyer M
3. Saenger W
4. Welfle K
5. Misselwitz R
6. Welfle H
7. Kozerski C
8. Volz A
9. Uchanska-Ziegler B
10. Ziegler A
(2004) Thermodynamic and structural analysis of peptide- and allele-dependent properties of two HLA-B27 subtypes exhibiting differential disease association
The Journal of Biological Chemistry 279:652–663.

https://doi.org/10.1074/jbc.M307457200
1. Hinks A
2. Bowes J
3. Cobb J
4. Ainsworth HC
5. Marion MC
6. Comeau ME
7. Sudman M
8. Han B
9. Becker ML
10. Bohnsack JF
11. de Bakker PIW
12. Haas JP
13. Hazen M
14. Lovell DJ
15. Nigrovic PA
16. Nordal E
17. Punnaro M
18. Rosenberg AM
19. Rygg M
20. Smith SL
21. Wise CA
22. Videm V
23. Wedderburn LR
24. Yarwood A
25. Yeung RSM
26. Prahalad S
27. Langefeld CD
28. Raychaudhuri S
29. Thompson SD
30. Thomson W
31. Juvenile Arthritis Consortium for Immunochip
(2017) Fine-mapping the MHC locus in juvenile idiopathic arthritis (JIA) reveals genetic heterogeneity corresponding to distinct adult inflammatory arthritic diseases
Annals of the Rheumatic Diseases 76:765–772.

https://doi.org/10.1136/annrheumdis-2016-210025
1. Hirata J
2. Hosomichi K
3. Sakaue S
4. Kanai M
5. Nakaoka H
6. Ishigaki K
7. Suzuki K
8. Akiyama M
9. Kishikawa T
10. Ogawa K
11. Masuda T
12. Yamamoto K
13. Hirata M
14. Matsuda K
15. Momozawa Y
16. Inoue I
17. Kubo M
18. Kamatani Y
19. Okada Y
(2019) Genetic and phenotypic landscape of the major histocompatibility complex region in the Japanese population
Nature Genetics 51:470–480.

https://doi.org/10.1038/s41588-018-0336-0
1. Hoare HL
2. Sullivan LC
3. Pietra G
4. Clements CS
5. Lee EJ
6. Ely LK
7. Beddoe T
8. Falco M
9. Kjer-Nielsen L
10. Reid HH
11. McCluskey J
12. Moretta L
13. Rossjohn J
14. Brooks AG
(2006) Structural basis for a major histocompatibility complex class Ib-restricted T cell response
Nature Immunology 7:256–264.

https://doi.org/10.1038/ni1312
1. Huan X
2. Zhuo N
3. Lee HY
4. Ren EC
(2023) Allopurinol non-covalently facilitates binding of unconventional peptides to HLA-B*58:01
Scientific Reports 13:9373.

https://doi.org/10.1038/s41598-023-36293-z
1. Hughes AL
2. Nei M
(1988) Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection
Nature 335:167–170.

https://doi.org/10.1038/335167a0
1. Hughes AL
2. Nei M
(1989) Nucleotide substitution at major histocompatibility complex class II loci: evidence for overdominant selection
PNAS 86:958–962.

https://doi.org/10.1073/pnas.86.3.958
1. Hurley CK
(2021) Naming HLA diversity: A review of HLA nomenclature
Human Immunology 82:457–465.

https://doi.org/10.1016/j.humimm.2020.03.005
1. Igic B
2. Bohs L
3. Kohn JR
(2006) Ancient polymorphism reveals unidirectional breeding system shifts
PNAS 103:1359–1363.

https://doi.org/10.1073/pnas.0506283103
1. Illing PT
2. Pymm P
3. Croft NP
4. Hilton HG
5. Jojic V
6. Han AS
7. Mendoza JL
8. Mifsud NA
9. Dudek NL
10. McCluskey J
11. Parham P
12. Rossjohn J
13. Vivian JP
14. Purcell AW
(2018) HLA-B57 micropolymorphism defines the sequence and conformational breadth of the immunopeptidome
Nature Communications 9:4693.

https://doi.org/10.1038/s41467-018-07109-w
(1990) Polymorphism at the self-incompatibility locus in Solanaceae predates speciation
PNAS 87:9732–9735.

https://doi.org/10.1073/pnas.87.24.9732
1. Jain M
2. Koren S
3. Miga KH
4. Quick J
5. Rand AC
6. Sasani TA
7. Tyson JR
8. Beggs AD
9. Dilthey AT
10. Fiddes IT
11. Malla S
12. Marriott H
13. Nieto T
14. O’Grady J
15. Olsen HE
16. Pedersen BS
17. Rhie A
18. Richardson H
19. Quinlan AR
20. Snutch TP
21. Tee L
22. Paten B
23. Phillippy AM
24. Simpson JT
25. Loman NJ
26. Loose M
(2018) Nanopore sequencing and assembly of a human genome with ultra-long reads
Nature Biotechnology 36:338–345.

https://doi.org/10.1038/nbt.4060
Book
1. Jeffreys H
(1998)
The Theory of Probability

Oxford University Press.
1. Jiang W
2. Birtley JR
3. Hung S-C
4. Wang W
5. Chiou S-H
6. Macaubas C
7. Kornum B
8. Tian L
9. Huang H
10. Adler L
11. Weaver G
12. Lu L
13. Ilstad-Minnihan A
14. Somasundaram S
15. Ayyangar S
16. Davis MM
17. Stern LJ
18. Mellins ED
(2019) In vivo clonal expansion and phenotypes of hypocretin-specific CD4⁺ T cells in narcolepsy patients and controls
Nature Communications 10:5247.

https://doi.org/10.1038/s41467-019-13234-x
1. Jiang J
2. Taylor DK
3. Kim EJ
4. Boyd LF
5. Ahmad J
6. Mage MG
7. Truong HV
8. Woodward CH
9. Sgourakis NG
10. Cresswell P
11. Margulies DH
12. Natarajan K
(2022a) Structural mechanism of tapasin-mediated MHC-I peptide loading in antigen presentation
Nature Communications 13:1–13.

https://doi.org/10.1038/s41467-022-33153-8
1. Jiang H
2. Wang C-W
3. Wang Z
4. Dai Y
5. Zhu Y
6. Lee Y-S
7. Cao Y
8. Chung W-H
9. Ouyang S
10. Wang H
(2022b) Functional and structural characteristics of HLA-B*13:01-mediated specific T cells reaction in dapsone-induced drug hypersensitivity
Journal of Biomedical Science 29:1–21.

https://doi.org/10.1186/s12929-022-00845-8
1. Karl JA
2. Prall TM
3. Bussan HE
4. Varghese JM
5. Pal A
6. Wiseman RW
7. O’Connor DH
(2023) Complete sequencing of a cynomolgus macaque major histocompatibility complex haplotype
Genome Research 33:448–462.

https://doi.org/10.1101/gr.277429.122
1. Kassardjian A
2. Sun E
3. Sookhoo J
4. Muthuraman K
5. Boligan KF
6. Kucharska I
7. Rujas E
8. Jetha A
9. Branch DR
10. Babiuk S
11. Barber B
12. Julien JP
(2023) Modular adjuvant-free pan-HLA-DR-immunotargeting subunit vaccine against SARS-CoV-2 elicits broad sarbecovirus-neutralizing antibody responses
Cell Reports 42:112391.

https://doi.org/10.1016/j.celrep.2023.112391
1. Kaufman J
(2022) The new W family reconstructs the evolution of MHC genes
PNAS 119:119–121.

https://doi.org/10.1073/pnas.2122079119
1. Kaur G
2. Gras S
3. Mobbs JI
4. Vivian JP
5. Cortes A
6. Barber T
7. Kuttikkatte SB
8. Jensen LT
9. Attfield KE
10. Dendrou CA
11. Carrington M
12. McVean G
13. Purcell AW
14. Rossjohn J
15. Fugger L
(2017) Structural and regulatory diversity shape HLA-C protein expression levels
Nature Communications 8:15924.

https://doi.org/10.1038/ncomms15924
(2017) What has GWAS done for HLA and disease associations?
International Journal of Immunogenetics 44:195–211.

https://doi.org/10.1111/iji.12332
1. Kiryu I
2. Dijkstra JM
3. Sarder RI
4. Fujiwara A
5. Yoshiura Y
6. Ototake M
(2005) New MHC class Ia domain lineages in rainbow trout (Oncorhynchus mykiss) which are shared with other fish species
Fish & Shellfish Immunology 18:243–254.

https://doi.org/10.1016/j.fsi.2004.07.007
Book
1. Klein J
(1980)
Generation of diversity at MHC loci: implications for T-cell receptor repertoires

In: Fougereau M, Dausset J, editors. Immunology. Academic Press. pp. 239–253.
1. Klein J
(1987) Origin of major histocompatibility complex polymorphism: the trans-species hypothesis
Human Immunology 19:155–162.

https://doi.org/10.1016/0198-8859(87)90066-8
1. Klobuch S
2. Lim JJ
3. van Balen P
4. Kester MGD
5. de Klerk W
6. de Ru AH
7. Pothast CR
8. Jedema I
9. Drijfhout JW
10. Rossjohn J
11. Reid HH
12. van Veelen PA
13. Falkenburg JHF
14. Heemskerk MHM
(2022) Human T cells recognize HLA-DP–bound peptides in two orientations
PNAS 119:e2214331119.

https://doi.org/10.1073/pnas.2214331119
(2000) Convergent evolution of major histocompatibility complex molecules in humans and New World monkeys
Immunogenetics 51:169–178.

https://doi.org/10.1007/s002510050028
(2001) Independent origin of functional MHC class II genes in humans and New World monkeys
Human Immunology 62:1–14.

https://doi.org/10.1016/S0198-8859(00)00233-0
1. Krishna C
2. Chiou J
3. Sakaue S
4. Kang JB
5. Christensen SM
6. Lee I
7. Aksit MA
8. Kim HI
9. von Schack D
10. Raychaudhuri S
11. Ziemek D
12. Hu X
(2024) The influence of HLA genetic variation on plasma protein expression
Nature Communications 15:6469.

https://doi.org/10.1038/s41467-024-50583-8
1. Kuderna LFK
2. Gao H
3. Janiak MC
4. Kuhlwilm M
5. Orkin JD
6. Bataillon T
7. Manu S
8. Valenzuela A
9. Bergman J
10. Rousselle M
11. Silva FE
12. Agueda L
13. Blanc J
14. Gut M
15. de Vries D
16. Goodhead I
17. Harris RA
18. Raveendran M
19. Jensen A
20. Chuma IS
21. Horvath JE
22. Hvilsom C
23. Juan D
24. Frandsen P
25. Schraiber JG
26. de Melo FR
27. Bertuol F
28. Byrne H
29. Sampaio I
30. Farias I
31. Valsecchi J
32. Messias M
33. da Silva MNF
34. Trivedi M
35. Rossi R
36. Hrbek T
37. Andriaholinirina N
38. Rabarivola CJ
39. Zaramody A
40. Jolly CJ
41. Phillips-Conroy J
42. Wilkerson G
43. Abee C
44. Simmons JH
45. Fernandez-Duque E
46. Kanthaswamy S
47. Shiferaw F
48. Wu D
49. Zhou L
50. Shao Y
51. Zhang G
52. Keyyu JD
53. Knauf S
54. Le MD
55. Lizano E
56. Merker S
57. Navarro A
58. Nadler T
59. Khor CC
60. Lee J
61. Tan P
62. Lim WK
63. Kitchener AC
64. Zinner D
65. Gut I
66. Melin AD
67. Guschanski K
68. Schierup MH
69. Beck RMD
70. Umapathy G
71. Roos C
72. Boubli JP
73. Rogers J
74. Farh KK-H
75. Marques Bonet T
(2023) A global catalog of whole-genome diversity from 233 primate species
Science 380:906–913.

https://doi.org/10.1126/science.abn7829
(2009) Structural basis for T cell alloreactivity among three HLA-B14 and HLA-B27 antigens
The Journal of Biological Chemistry 284:29784–29797.

https://doi.org/10.1074/jbc.M109.038497
1. Kundu S
2. Faulkes CG
(2007) A tangled history: patterns of major histocompatibility complex evolution in the African mole-rats (Family: Bathyergidae)
Biological Journal of the Linnean Society 91:493–503.

https://doi.org/10.1111/j.1095-8312.2007.00814.x
(1992) Shared polymorphism between gorilla and human major histocompatibility complex DRB loci
Human Immunology 34:267–278.

https://doi.org/10.1016/0198-8859(92)90026-j
1. Kusano S
2. Kukimoto-Niino M
3. Satta Y
4. Ohsawa N
5. Uchikubo-Kamo T
6. Wakiyama M
7. Ikeda M
8. Terada T
9. Yamamoto K
10. Nishimura Y
11. Shirouzu M
12. Sasazuki T
13. Yokoyama S
(2014) Structural basis for the specific recognition of the major antigenic peptide from the Japanese cedar pollen allergen Cry j 1 by HLA-DP5
Journal of Molecular Biology 426:3016–3027.

https://doi.org/10.1016/j.jmb.2014.06.020
1. Lang HLE
2. Jacobsen H
3. Ikemizu S
4. Andersson C
5. Harlos K
6. Madsen L
7. Hjorth P
8. Sondergaard L
9. Svejgaard A
10. Wucherpfennig K
11. Stuart DI
12. Bell JI
13. Jones EY
14. Fugger L
(2002) A functional and structural basis for TCR cross-reactivity in multiple sclerosis
Nature Immunology 3:940–943.

https://doi.org/10.1038/ni835
1. Lawlor DA
2. Ward FE
3. Ennis PD
4. Jackson AP
5. Parham P
(1988) HLA-A and B polymorphisms predate the divergence of humans and chimpanzees
Nature 335:268–271.

https://doi.org/10.1038/335268a0
1. Lee SJ
2. Klein J
3. Haagenson M
4. Baxter-Lowe LA
5. Confer DL
6. Eapen M
7. Fernandez-Vina M
8. Flomenberg N
9. Horowitz M
10. Hurley CK
11. Noreen H
12. Oudshoorn M
13. Petersdorf E
14. Setterholm M
15. Spellman S
16. Weisdorf D
17. Williams TM
18. Anasetti C
(2007) High-resolution donor-recipient HLA matching contributes to the success of unrelated donor marrow transplantation
Blood 110:4576–4583.

https://doi.org/10.1182/blood-2007-06-097386
1. Leffler EM
2. Gao Z
3. Pfeifer S
4. Ségurel L
5. Auton A
6. Venn O
7. Bowden R
8. Bontrop R
9. Wall JD
10. Sella G
11. Donnelly P
12. McVean G
13. Przeworski M
(2013) Multiple instances of ancient balancing selection shared between humans and chimpanzees
Science 339:1578–1582.

https://doi.org/10.1126/science.1234070
1. Li Y
2. Li H
3. Martin R
4. Mariuzza RA
(2000) Structural basis for the binding of an immunodominant peptide from myelin basic protein in different registers by two HLA-DR2 proteins
Journal of Molecular Biology 304:177–188.

https://doi.org/10.1006/jmbi.2000.4198
1. Li L
2. Zhou X
3. Chen X
(2011) Characterization and evolution of MHC class II B genes in ardeid birds
Journal of Molecular Evolution 72:474–483.

https://doi.org/10.1007/s00239-011-9446-3
1. Li X
2. Singh NK
3. Collins DR
4. Ng R
5. Zhang A
6. Lamothe-Molina PA
7. Shahinian P
8. Xu S
9. Tan K
10. Piechocka-Trocha A
11. Urbach JM
12. Weber JK
13. Gaiha GD
14. Takou Mbah OC
15. Huynh T
16. Cheever S
17. Chen J
18. Birnbaum M
19. Zhou R
20. Walker BD
21. Wang J
(2023) Molecular basis of differential HLA class I-restricted T cell recognition of a highly networked HIV peptide
Nature Communications 14:38573.

https://doi.org/10.1038/s41467-023-38573-8
1. Lim Kam Sian TCC
2. Indumathy S
3. Halim H
4. Greule A
5. Cryle MJ
6. Bowness P
7. Rossjohn J
8. Gras S
9. Purcell AW
10. Schittenhelm RB
(2019) Allelic association with ankylosing spondylitis fails to correlate with human leukocyte antigen B27 homodimer formation
The Journal of Biological Chemistry 294:20185–20195.

https://doi.org/10.1074/jbc.RA119.010257
1. Liu J
2. Chen KY
3. Ren EC
(2011) Structural insights into the binding of hepatitis B virus core peptide to HLA-A2 alleles: towards designing better vaccines
European Journal of Immunology 41:2097–2106.

https://doi.org/10.1002/eji.201041370
1. Liu YC
2. Miles JJ
3. Neller MA
4. Gostick E
5. Price DA
6. Purcell AW
7. McCluskey J
8. Burrows SR
9. Rossjohn J
10. Gras S
(2013) Highly divergent T-cell receptor binding modes underlie specific recognition of a bulged viral peptide bound to a human leukocyte antigen class I molecule
The Journal of Biological Chemistry 288:15442–15454.

https://doi.org/10.1074/jbc.M112.447185
1. Liu YC
2. Chen Z
3. Neller MA
4. Miles JJ
5. Purcell AW
6. McCluskey J
7. Burrows SR
8. Rossjohn J
9. Gras S
(2014) A molecular basis for the interplay between T cells, viral mutants, and human leukocyte antigen micropolymorphism
The Journal of Biological Chemistry 289:16688–16698.

https://doi.org/10.1074/jbc.M114.563502
1. Liu C
(2021) A long road/read to rapid high-resolution HLA typing: The nanopore perspective
Human Immunology 82:488–495.

https://doi.org/10.1016/j.humimm.2020.04.009
1. Loisel DA
2. Rockman MV
3. Wray GA
4. Altmann J
5. Alberts SC
(2006) Ancient polymorphism and functional variation in the primate MHC-DQA1 5’ cis-regulatory region
PNAS 103:16331–16336.

https://doi.org/10.1073/pnas.0607662103
Data
(authors) (2020) Crystal structure of HLA-B2709 complexed with the nona-peptide ma
Worldwide Protein Data Bank.

https://doi.org/10.2210/pdb6Y27/pdb
Data
1. Lu D
2. Chen Y
3. Jiang M
(authors) (2023) Crystal structure of a TCR in complex with HLA-A*11:01 bound to KRAS peptide (VVGAVGVGK)
Worldwide Protein Data Bank.

https://doi.org/10.2210/pdb8I5C/pdb
1. Lugo JS
2. Cadavid LF
(2015) Patterns of MHC-G-like and MHC-B diversification in New World monkeys
PLOS ONE 10:e0131343.

https://doi.org/10.1371/journal.pone.0131343
1. Luo Y
2. Kanai M
3. Choi W
4. Li X
5. Sakaue S
6. Yamamoto K
7. Ogawa K
8. Gutierrez-Arcelus M
9. Gregersen PK
10. Stuart PE
11. Elder JT
12. Forer L
13. Schönherr S
14. Fuchsberger C
15. Smith AV
16. Fellay J
17. Carrington M
18. Haas DW
19. Guo X
20. Palmer ND
21. Chen YDI
22. Rotter JI
23. Taylor KD
24. Rich SS
25. Correa A
26. Wilson JG
27. Kathiresan S
28. Cho MH
29. Metspalu A
30. Esko T
31. Okada Y
32. Han B
33. McLaren PJ
34. Raychaudhuri S
35. NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium
(2021) Author Correction: A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response
Nature Genetics 53:1722.

https://doi.org/10.1038/s41588-021-00979-9
1. Maccari G
2. Robinson J
3. Ballingall K
4. Guethlein LA
5. Grimholt U
6. Kaufman J
7. Ho C-S
8. de Groot NG
9. Flicek P
10. Bontrop RE
11. Hammond JA
12. Marsh SGE
(2017) IPD-MHC 2.0: an improved inter-species database for the study of the major histocompatibility complex
Nucleic Acids Research 45:D860–D864.

https://doi.org/10.1093/nar/gkw1050
(2020) The IPD Project: a centralised resource for the study of polymorphism in genes of the immune system
Immunogenetics 72:49–55.

https://doi.org/10.1007/s00251-019-01133-w
1. Macdonald WA
2. Chen Z
3. Gras S
4. Archbold JK
5. Tynan FE
6. Clements CS
7. Bharadwaj M
8. Kjer-Nielsen L
9. Saunders PM
10. Wilce MCJ
11. Crawford F
12. Stadinsky B
13. Jackson D
14. Brooks AG
15. Purcell AW
16. Kappler JW
17. Burrows SR
18. Rossjohn J
19. McCluskey J
(2009) T cell allorecognition via molecular mimicry
Immunity 31:897–908.

https://doi.org/10.1016/j.immuni.2009.09.025
(2017) MHC class I diversity in chimpanzees and bonobos
Immunogenetics 69:661–676.

https://doi.org/10.1007/s00251-017-0990-x
1. Mallick S
2. Li H
3. Lipson M
4. Mathieson I
5. Gymrek M
6. Racimo F
7. Zhao M
8. Chennagiri N
9. Nordenfelt S
10. Tandon A
11. Skoglund P
12. Lazaridis I
13. Sankararaman S
14. Fu Q
15. Rohland N
16. Renaud G
17. Erlich Y
18. Willems T
19. Gallo C
20. Spence JP
21. Song YS
22. Poletti G
23. Balloux F
24. van Driem G
25. de Knijff P
26. Romero IG
27. Jha AR
28. Behar DM
29. Bravi CM
30. Capelli C
31. Hervig T
32. Moreno-Estrada A
33. Posukh OL
34. Balanovska E
35. Balanovsky O
36. Karachanak-Yankova S
37. Sahakyan H
38. Toncheva D
39. Yepiskoposyan L
40. Tyler-Smith C
41. Xue Y
42. Abdullah MS
43. Ruiz-Linares A
44. Beall CM
45. Di Rienzo A
46. Jeong C
47. Starikovskaya EB
48. Metspalu E
49. Parik J
50. Villems R
51. Henn BM
52. Hodoglugil U
53. Mahley R
54. Sajantila A
55. Stamatoyannopoulos G
56. Wee JTS
57. Khusainova R
58. Khusnutdinova E
59. Litvinov S
60. Ayodo G
61. Comas D
62. Hammer MF
63. Kivisild T
64. Klitz W
65. Winkler CA
66. Labuda D
67. Bamshad M
68. Jorde LB
69. Tishkoff SA
70. Watkins WS
71. Metspalu M
72. Dryomov S
73. Sukernik R
74. Singh L
75. Thangaraj K
76. Pääbo S
77. Kelso J
78. Patterson N
79. Reich D
(2016) The Simons Genome Diversity Project: 300 genomes from 142 diverse populations
Nature 538:201–206.

https://doi.org/10.1038/nature18964
1. Marsh SGE
2. Albert ED
3. Bodmer WF
4. Bontrop RE
5. Dupont B
6. Erlich HA
7. Fernández-Viña M
8. Geraghty DE
9. Holdsworth R
10. Hurley CK
11. Lau M
12. Lee KW
13. Mach B
14. Maiers M
15. Mayr WR
16. Müller CR
17. Parham P
18. Petersdorf EW
19. Sasazuki T
20. Strominger JL
21. Svejgaard A
22. Terasaki PI
23. Tiercy JM
24. Trowsdale J
(2010) Nomenclature for factors of the HLA system, 2010
Tissue Antigens 75:291–455.

https://doi.org/10.1111/j.1399-0039.2010.01466.x
1. Mathieson I
2. Lazaridis I
3. Rohland N
4. Mallick S
5. Patterson N
6. Roodenberg SA
7. Harney E
8. Stewardson K
9. Fernandes D
10. Novak M
11. Sirak K
12. Gamba C
13. Jones ER
14. Llamas B
15. Dryomov S
16. Pickrell J
17. Arsuaga JL
18. de Castro JMB
19. Carbonell E
20. Gerritsen F
21. Khokhlov A
22. Kuznetsov P
23. Lozano M
24. Meller H
25. Mochalov O
26. Moiseyev V
27. Guerra MAR
28. Roodenberg J
29. Vergès JM
30. Krause J
31. Cooper A
32. Alt KW
33. Brown D
34. Anthony D
35. Lalueza-Fox C
36. Haak W
37. Pinhasi R
38. Reich D
(2015) Genome-wide patterns of selection in 230 ancient Eurasians
Nature 528:499–503.

https://doi.org/10.1038/nature16152
1. Mayer WE
2. Jonker M
3. Klein D
4. Ivanyi P
5. van Seventer G
6. Klein J
(1988) Nucleotide sequences of chimpanzee MHC class I alleles: evidence for trans-species mode of evolution
The EMBO Journal 7:2765–2774.

https://doi.org/10.1002/j.1460-2075.1988.tb03131.x
(1992) Trans-species origin of Mhc-DRB polymorphism in the chimpanzee
Immunogenetics 37:12–23.

https://doi.org/10.1007/BF00223540
(1988) The origin of MHC class II gene polymorphism within the genus Mus
Nature 332:651–654.

https://doi.org/10.1038/332651a0
(1999) Taxonomic hierarchy of HLA class I allele sequences
Genes & Immunity 1:120–129.

https://doi.org/10.1038/sj.gene.6363648
1. McLaren PJ
2. Ripke S
3. Pelak K
4. Weintrob AC
5. Patsopoulos NA
6. Jia X
7. Erlich RL
8. Lennon NJ
9. Kadie CM
10. Heckerman D
11. Gupta N
12. Haas DW
13. Deeks SG
14. Pereyra F
15. Walker BD
16. de Bakker PIW
17. International HIV Controllers Study
(2012) Fine-mapping classical HLA variation associated with durable host control of HIV-1 infection in African Americans
Human Molecular Genetics 21:4334–4347.

https://doi.org/10.1093/hmg/dds226
1. Miller JD
2. Weber DA
3. Ibegbu C
4. Pohl J
5. Altman JD
6. Jensen PE
(2003) Analysis of HLA-E peptide-binding specificity and contact residues in bound peptide required for recognition by CD94/NKG2
Journal of Immunology 171:1369–1375.

https://doi.org/10.4049/jimmunol.171.3.1369
1. Mobbs JI
2. Illing PT
3. Dudek NL
4. Brooks AG
5. Baker DG
6. Purcell AW
7. Rossjohn J
8. Vivian JP
(2017) The molecular basis for peptide repertoire selection in the human leukocyte antigen (HLA) C*06:02 molecule
Journal of Biological Chemistry 292:17203–17215.

https://doi.org/10.1074/jbc.M117.806976
1. Molineros JE
2. Looger LL
3. Kim K
4. Okada Y
5. Terao C
6. Sun C
7. Zhou X-J
8. Raj P
9. Kochi Y
10. Suzuki A
11. Akizuki S
12. Nakabo S
13. Bang S-Y
14. Lee H-S
15. Kang YM
16. Suh C-H
17. Chung WT
18. Park Y-B
19. Choe J-Y
20. Shim S-C
21. Lee S-S
22. Zuo X
23. Yamamoto K
24. Li Q-Z
25. Shen N
26. Porter LL
27. Harley JB
28. Chua KH
29. Zhang H
30. Wakeland EK
31. Tsao BP
32. Bae S-C
33. Nath SK
(2019) Amino acid signatures of HLA Class-I and II molecules are strongly associated with SLE susceptibility and autoantibody production in Eastern Asians
PLOS Genetics 15:e1008092.

https://doi.org/10.1371/journal.pgen.1008092
1. Moradi S
2. Stankovic S
3. O’Connor GM
4. Pymm P
5. MacLachlan BJ
6. Faoro C
7. Retière C
8. Sullivan LC
9. Saunders PM
10. Widjaja J
11. Cox-Livingstone S
12. Rossjohn J
13. Brooks AG
14. Vivian JP
(2021) Structural plasticity of KIR2DL2 and KIR2DL3 enables altered docking geometries atop HLA-C
Nature Communications 12:2173.

https://doi.org/10.1038/s41467-021-22359-x
(1998) The structure of HLA-DM, the peptide exchange catalyst that loads antigen onto class II MHC molecules during antigen presentation
Immunity 9:377–383.

https://doi.org/10.1016/s1074-7613(00)80620-2
1. Motozono C
2. Kuse N
3. Sun X
4. Rizkallah PJ
5. Fuller A
6. Oka S
7. Cole DK
8. Sewell AK
9. Takiguchi M
(2014) Molecular basis of a dominant T cell response to an HIV reverse transcriptase 8-mer epitope presented by the protective allele HLA-B*51:01
Journal of Immunology 192:3428–3434.

https://doi.org/10.4049/jimmunol.1302667
1. Müller NF
2. Bouckaert RR
(2020) Adaptive Metropolis-coupled MCMC for BEAST 2
PeerJ 8:e9473.

https://doi.org/10.7717/peerj.9473
1. Naito T
2. Suzuki K
3. Hirata J
4. Kamatani Y
5. Matsuda K
6. Toda T
7. Okada Y
(2021) A deep learning method for HLA imputation and trans-ethnic MHC fine-mapping of type 1 diabetes
Nature Communications 12:1639.

https://doi.org/10.1038/s41467-021-21975-x
1. Neefjes J
2. Jongsma MLM
3. Paul P
4. Bakke O
(2011) Towards a systems understanding of MHC class I and MHC class II antigen presentation
Nature Reviews. Immunology 11:823–836.

https://doi.org/10.1038/nri3084
Book
1. Nei M
2. Hughes AL
(1991)
Polymorphism and evolution of the major histocompatibility complex loci in mammals

In: Selander R, Clark A, Whittam T, editors. Evolution at the Molecular Level. Sinauer Associates, Inc. pp. 222–247.
1. Newman RM
2. Hall L
3. Connole M
4. Chen G-L
5. Sato S
6. Yuste E
7. Diehl W
8. Hunter E
9. Kaur A
10. Miller GM
11. Johnson WE
(2006) Balancing selection and the evolution of functional polymorphism in Old World monkey TRIM5alpha
PNAS 103:19134–19139.

https://doi.org/10.1073/pnas.0605838103
(2017) Unraveling the structural basis for the unusually rich association of human leukocyte antigen DQ2.5 with class-II-associated invariant chain peptides
The Journal of Biological Chemistry 292:9218–9228.

https://doi.org/10.1074/jbc.M117.785139
1. Nicholson MJ
2. Moradi B
3. Seth NP
4. Xing X
5. Cuny GD
6. Stein RL
7. Wucherpfennig KW
(2006) Small molecules that enhance the catalytic efficiency of HLA-DM
Journal of Immunology 176:4208–4220.

https://doi.org/10.4049/jimmunol.176.7.4208
1. Nielsen M
2. Lundegaard C
3. Blicher T
4. Lamberth K
5. Harndahl M
6. Justesen S
7. Røder G
8. Peters B
9. Sette A
10. Lund O
11. Buus S
(2007) NetMHCpan, a method for quantitative predictions of peptide binding to any HLA-A and -B locus protein of known sequence
PLOS ONE 2:e796.

https://doi.org/10.1371/journal.pone.0000796
1. Niu L
2. Cheng H
3. Zhang S
4. Tan S
5. Zhang Y
6. Qi J
7. Liu J
8. Gao GF
(2013) Structural basis for the differential classification of HLA-A*6802 and HLA-A*6801 into the A2 and A3 supertypes
Molecular Immunology 55:381–392.

https://doi.org/10.1016/j.molimm.2013.03.015
1. Nunes K
2. Maia MHT
3. Dos Santos EJM
4. Dos Santos SEB
5. Guerreiro JF
6. Petzl-Erler ML
7. Bedoya G
8. Gallo C
9. Poletti G
10. Llop E
11. Tsuneto L
12. Bortolini MC
13. Rothhammer F
14. Single R
15. Ruiz-Linares A
16. Rocha J
17. Meyer D
(2021) How natural selection shapes genetic differentiation in the MHC region: A case study with Native Americans
Human Immunology 82:523–531.

https://doi.org/10.1016/j.humimm.2021.03.005
1. Okada Y
2. Momozawa Y
3. Sakaue S
4. Kanai M
5. Ishigaki K
6. Akiyama M
7. Kishikawa T
8. Arai Y
9. Sasaki T
10. Kosaki K
11. Suematsu M
12. Matsuda K
13. Yamamoto K
14. Kubo M
15. Hirose N
16. Kamatani Y
(2018) Deep whole-genome sequencing reveals recent selection signatures linked to evolution and disease risk of Japanese
Nature Communications 9:1631.

https://doi.org/10.1038/s41467-018-03274-0
1. O’Leary NA
2. Wright MW
3. Brister JR
4. Ciufo S
5. Haddad D
6. McVeigh R
7. Rajput B
8. Robbertse B
9. Smith-White B
10. Ako-Adjei D
11. Astashyn A
12. Badretdin A
13. Bao Y
14. Blinkova O
15. Brover V
16. Chetvernin V
17. Choi J
18. Cox E
19. Ermolaeva O
20. Farrell CM
21. Goldfarb T
22. Gupta T
23. Haft D
24. Hatcher E
25. Hlavina W
26. Joardar VS
27. Kodali VK
28. Li W
29. Maglott D
30. Masterson P
31. McGarvey KM
32. Murphy MR
33. O’Neill K
34. Pujar S
35. Rangwala SH
36. Rausch D
37. Riddick LD
38. Schoch C
39. Shkeda A
40. Storz SS
41. Sun H
42. Thibaud-Nissen F
43. Tolstoy I
44. Tully RE
45. Vatsan AR
46. Wallin C
47. Webb D
48. Wu W
49. Landrum MJ
50. Kimchi A
51. Tatusova T
52. DiCuccio M
53. Kitts P
54. Murphy TD
55. Pruitt KD
(2016) Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation
Nucleic Acids Research 44:D733–D745.

https://doi.org/10.1093/nar/gkv1189
1. Ooi JD
2. Petersen J
3. Tan YH
4. Huynh M
5. Willett ZJ
6. Ramarathinam SH
7. Eggenhuizen PJ
8. Loh KL
9. Watson KA
10. Gan PY
11. Alikhan MA
12. Dudek NL
13. Handel A
14. Hudson BG
15. Fugger L
16. Power DA
17. Holt SG
18. Coates PT
19. Gregersen JW
20. Purcell AW
21. Holdsworth SR
22. La Gruta NL
23. Reid HH
24. Rossjohn J
25. Kitching AR
(2017) Dominant protection from HLA-linked autoimmunity by antigen-specific regulatory T cells
Nature 545:243–247.

https://doi.org/10.1038/nature22329
(1992) Mhc-DQB repertoire variation in hominoid and Old World primate species
Journal of Immunology 149:461–470.

https://doi.org/10.4049/jimmunol.149.2.461
1. Otting N
2. Bontrop RE
(1995) Evolution of the major histocompatibility complex DPA1 locus in primates
Human Immunology 42:184–187.

https://doi.org/10.1016/0198-8859(94)00095-8
(2000) Allelic diversity of Mhc-DRB alleles in rhesus macaques
Tissue Antigens 56:58–68.

https://doi.org/10.1034/j.1399-0039.2000.560108.x
(2002) Extensive Mhc-DQB variation in humans and non-human primate species
Immunogenetics 54:230–239.

https://doi.org/10.1007/s00251-002-0461-9
- PubMed
- Google Scholar
(2019) HLAIb worldwide genetic diversity: New HLA-H alleles and haplotype structure description
Molecular Immunology 112:40–50.

https://doi.org/10.1016/j.molimm.2019.04.017
- PubMed
- Google Scholar
(2007) Crystallographic structure of the human leukocyte antigen DRA, DRB3*0101: Models of a directional alloimmune response and autoimmunity
Journal of Molecular Biology 371:435–446.

https://doi.org/10.1016/j.jmb.2007.05.025
- Google Scholar
1. Petersen J
2. Kooy-Winkelaar Y
3. Loh KL
4. Tran M
5. van Bergen J
6. Koning F
7. Rossjohn J
8. Reid HH
(2016) Diverse T cell receptor gene usage in HLA-DQ8-associated celiac disease converges into a consensus binding solution
Structure 24:1643–1657.

https://doi.org/10.1016/j.str.2016.07.010
- Google Scholar
1. Petrie EJ
2. Clements CS
3. Lin J
4. Sullivan LC
5. Johnson D
6. Huyton T
7. Heroux A
8. Hoare HL
9. Beddoe T
10. Reid HH
11. Wilce MCJ
12. Brooks AG
13. Rossjohn J
(2008) CD94-NKG2A recognition of human leukocyte antigen (HLA)-E bound to an HLA class I leader sequence
The Journal of Experimental Medicine 205:725–735.

https://doi.org/10.1084/jem.20072525
- PubMed
- Google Scholar
1. Piontkivska H
(2003) Birth-and-death evolution in primate MHC class I genes: divergence time estimates
Molecular Biology and Evolution 20:601–609.

https://doi.org/10.1093/molbev/msg064
- Google Scholar
1. Pos W
2. Sethi DK
3. Call MJ
4. Schulze M
5. Anders AK
6. Pyrdol J
7. Wucherpfennig KW
(2012) Crystal structure of the HLA-DM-HLA-DR1 complex defines mechanisms for rapid peptide selection
Cell 151:1557–1568.

https://doi.org/10.1016/j.cell.2012.11.025
- PubMed
- Google Scholar
(2021) The maintenance of polymorphism in an ancient social supergene
Molecular Ecology 30:6246–6258.

https://doi.org/10.1111/mec.16196
- PubMed
- Google Scholar
1. Racle J
2. Guillaume P
3. Schmidt J
4. Michaux J
5. Larabi A
6. Lau K
7. Perez MAS
8. Croce G
9. Genolet R
10. Coukos G
11. Zoete V
12. Pojer F
13. Bassani-Sternberg M
14. Harari A
15. Gfeller D
(2023) Machine learning predictions of MHC-II specificities reveal alternative binding mode of class II epitopes
Immunity 56:1359–1375.

https://doi.org/10.1016/j.immuni.2023.03.009
- PubMed
- Google Scholar
1. Radwan J
2. Babik W
3. Kaufman J
4. Lenz TL
5. Winternitz J
(2020) Advances in the evolutionary understanding of MHC polymorphism
Trends in Genetics 36:298–311.

https://doi.org/10.1016/j.tig.2020.01.008
- PubMed
- Google Scholar
1. Rambaut A
2. Drummond AJ
3. Xie D
4. Baele G
5. Suchard MA
(2018) Posterior summarization in Bayesian phylogenetics using Tracer 1.7
Systematic Biology 67:901–904.

https://doi.org/10.1093/sysbio/syy032
- PubMed
- Google Scholar
1. Raychaudhuri S
2. Sandor C
3. Stahl EA
4. Freudenberg J
5. Lee HS
6. Jia X
7. Alfredsson L
8. Padyukov L
9. Klareskog L
10. Worthington J
11. Siminovitch KA
12. Bae SC
13. Plenge RM
14. Gregersen PK
15. de Bakker PIW
(2012) Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis
Nature Genetics 44:291–296.

https://doi.org/10.1038/ng.1076
- PubMed
- Google Scholar
1. Rist MJ
2. Theodossis A
3. Croft NP
4. Neller MA
5. Welland A
6. Chen Z
7. Sullivan LC
8. Burrows JM
9. Miles JJ
10. Brennan RM
11. Gras S
12. Khanna R
13. Brooks AG
14. McCluskey J
15. Purcell AW
16. Rossjohn J
17. Burrows SR
(2013) HLA peptide length preferences control CD8+ T cell responses
Journal of Immunology 191:561–571.

https://doi.org/10.4049/jimmunol.1300292
- PubMed
- Google Scholar
1. Robinson J
2. Barker DJ
3. Georgiou X
4. Cooper MA
5. Flicek P
6. Marsh SGE
(2019) IPD-IMGT/HLA Database
Nucleic Acids Research 48:D948–D955.

https://doi.org/10.1093/nar/gkz950
- Google Scholar
(2024) 25 years of the IPD-IMGT/HLA Database
HLA 103:e15549.

https://doi.org/10.1111/tan.15549
- PubMed
- Google Scholar
1. Sachidanandam R
2. Weissman D
3. Schmidt SC
4. Kakol JM
5. Stein LD
6. Marth G
7. Sherry S
8. Mullikin JC
9. Mortimore BJ
10. Willey DL
11. Hunt SE
12. Cole CG
13. Coggill PC
14. Rice CM
15. Ning Z
16. Rogers J
17. Bentley DR
18. Kwok PY
19. Mardis ER
20. Yeh RT
21. Schultz B
22. Cook L
23. Davenport R
24. Dante M
25. Fulton L
26. Hillier L
27. Waterston RH
28. McPherson JD
29. Gilman B
30. Schaffner S
31. Van Etten WJ
32. Reich D
33. Higgins J
34. Daly MJ
35. Blumenstiel B
36. Baldwin J
37. Stange-Thomann N
38. Zody MC
39. Linton L
40. Lander ES
41. Altshuler D
42. International SNP Map Working Group
(2001) A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms
Nature 409:928–933.

https://doi.org/10.1038/35057149
- PubMed
- Google Scholar
1. Sakaue S
2. Kanai M
3. Tanigawa Y
4. Karjalainen J
5. Kurki M
6. Koshiba S
7. Narita A
8. Konuma T
9. Yamamoto K
10. Akiyama M
11. Ishigaki K
12. Suzuki A
13. Suzuki K
14. Obara W
15. Yamaji K
16. Takahashi K
17. Asai S
18. Takahashi Y
19. Suzuki T
20. Shinozaki N
21. Yamaguchi H
22. Minami S
23. Murayama S
24. Yoshimori K
25. Nagayama S
26. Obata D
27. Higashiyama M
28. Masumoto A
29. Koretsune Y
30. Ito K
31. Terao C
32. Yamauchi T
33. Komuro I
34. Kadowaki T
35. Tamiya G
36. Yamamoto M
37. Nakamura Y
38. Kubo M
39. Murakami Y
40. Yamamoto K
41. Kamatani Y
42. Palotie A
43. Rivas MA
44. Daly MJ
45. Matsuda K
46. Okada Y
47. FinnGen
(2021) A cross-population atlas of genetic associations for 220 human phenotypes
Nature Genetics 53:1415–1424.

https://doi.org/10.1038/s41588-021-00931-x
- PubMed
- Google Scholar
1. Sano EB
2. Wall CA
3. Hutchins PR
4. Miller SR
(2018) Ancient balancing selection on heterocyst function in a cosmopolitan cyanobacterium
Nature Ecology & Evolution 2:510–519.

https://doi.org/10.1038/s41559-017-0435-9
- Google Scholar
1. Satta Y
2. Mayer WE
3. Klein J
(1996) Evolutionary relationship of HLA-DRB genes inferred from intron sequences
Journal of Molecular Evolution 42:648–657.

https://doi.org/10.1007/BF02338798
- Google Scholar
1. Saunders PM
2. Vivian JP
3. Baschuk N
4. Beddoe T
5. Widjaja J
6. O’Connor GM
7. Hitchen C
8. Pymm P
9. Andrews DM
10. Gras S
11. McVicar DW
12. Rossjohn J
13. Brooks AG
(2015) The interaction of KIR3DL1*001 with HLA class I molecules is dependent upon molecular microarchitecture within the Bw4 epitope
The Journal of Immunology 194:781–789.

https://doi.org/10.4049/jimmunol.1402542
- Google Scholar
Book
1. Sawyer SA
(1999)
GENECONV: A Computer Package for the Statistical Detection of Gene Conversion

ScienceOpen.
- Google Scholar
1. Scally SW
2. Law SC
3. Ting YT
4. van Heemst J
5. Sokolove J
6. Deutsch AJ
7. Bridie Clemens E
8. Moustakas AK
9. Papadopoulos GK
10. van der Woude D
11. Smolik I
12. Hitchon CA
13. Robinson DB
14. Ferucci ED
15. Bernstein CN
16. Meng X
17. Anaparti V
18. Huizinga T
19. Kedzierska K
20. Reid HH
21. Raychaudhuri S
22. Toes RE
23. Rossjohn J
24. El-Gabalawy H
25. Thomas R
(2017) Molecular basis for increased susceptibility of Indigenous North Americans to seropositive rheumatoid arthritis
Annals of the Rheumatic Diseases 76:1915–1923.

https://doi.org/10.1136/annrheumdis-2017-211300
- PubMed
- Google Scholar
Software
1. Schrödinger, LLC
(2021) The PyMOL molecular graphics system, version 2.4.2
PyMOL.

http://www.pymol.org/
1. Schulze M-SE
2. Wucherpfennig KW
(2012) The mechanism of HLA-DM induced peptide exchange in the MHC class II antigen presentation pathway
Current Opinion in Immunology 24:105–111.

https://doi.org/10.1016/j.coi.2011.11.004
- Google Scholar
Data
1. Schutte R
2. Li D
3. Ostrov D
(authors) (2020) HLA-B*15:01 complexed with a synthetic peptide
Worldwide Protein Data Bank.

https://doi.org/10.2210/pdb6UZP/pdb
1. Ségurel L
2. Thompson EE
3. Flutre T
4. Lovstad J
5. Venkat A
6. Margulis SW
7. Moyse J
8. Ross S
9. Gamble K
10. Sella G
11. Ober C
12. Przeworski M
(2012) The ABO blood group is a trans-species polymorphism in primates
PNAS 109:18493–18498.

https://doi.org/10.1073/pnas.1210603109
- PubMed
- Google Scholar
1. Sharon E
2. Sibener LV
3. Battle A
4. Fraser HB
5. Garcia KC
6. Pritchard JK
(2016) Genetic variation in MHC proteins is associated with T cell receptor expression biases
Nature Genetics 48:995–1002.

https://doi.org/10.1038/ng.3625
- Google Scholar
1. Shiroishi M
2. Kuroki K
3. Rasubala L
4. Tsumoto K
5. Kumagai I
6. Kurimoto E
7. Kato K
8. Kohda D
9. Maenaka K
(2006) Structural basis for recognition of the nonclassical MHC molecule HLA-G by the leukocyte Ig-like receptor B2 (LILRB2/LIR2/ILT4/CD85d)
PNAS 103:16412–16417.

https://doi.org/10.1073/pnas.0605228103
- PubMed
- Google Scholar
1. Simons ND
2. Eick GN
3. Ruiz-Lopez MJ
4. Omeja PA
5. Chapman CA
6. Goldberg TL
7. Ting N
8. Sterner KN
(2017) Cis-regulatory evolution in a wild primate: Infection-associated genetic variation drives differential expression of MHC-DQA1 in vitro
Molecular Ecology 26:4523–4535.

https://doi.org/10.1111/mec.14221
- PubMed
- Google Scholar
(1992) Evolutionary stability of transspecies major histocompatibility complex class II DRB lineages in humans and rhesus monkeys
Human Immunology 35:29–39.

https://doi.org/10.1016/0198-8859(92)90092-2
- PubMed
- Google Scholar
(1995) Allelic diversity at the Mhc-DP locus in rhesus macaques (Macaca mulatta)
Immunogenetics 41:29–37.

https://doi.org/10.1007/BF00188429
- PubMed
- Google Scholar
(1998) Crystal structure of HLA-DR2 (DRA*0101, DRB1*1501) complexed with a peptide from human myelin basic protein
The Journal of Experimental Medicine 188:1511–1520.

https://doi.org/10.1084/jem.188.8.1511
- PubMed
- Google Scholar
Preprint
1. Smith CJ
2. Strausz S
3. Spence JP
4. Ollila HM
5. Pritchard JK
6. FinnGen
(2024) Haplotype analysis reveals pleiotropic disease associations in the HLA region
medRxiv.

https://doi.org/10.1101/2024.07.29.24311183
- Google Scholar
1. Stuart PE
2. Tsoi LC
3. Nair RP
4. Ghosh M
5. Kabra M
6. Shaiq PA
7. Raja GK
8. Qamar R
9. Thelma BK
10. Patrick MT
11. Parihar A
12. Singh S
13. Khandpur S
14. Kumar U
15. Wittig M
16. Degenhardt F
17. Tejasvi T
18. Voorhees JJ
19. Weidinger S
20. Franke A
21. Abecasis GR
22. Sharma VK
23. Elder JT
(2022) Transethnic analysis of psoriasis susceptibility in South Asians and Europeans enhances fine-mapping in the MHC and genomewide
HGG Advances 3:100069.

https://doi.org/10.1016/j.xhgg.2021.100069
- PubMed
- Google Scholar
1. Sullivan LC
2. Walpole NG
3. Farenc C
4. Pietra G
5. Sum MJW
6. Clements CS
7. Lee EJ
8. Beddoe T
9. Falco M
10. Mingari MC
11. Moretta L
12. Gras S
13. Rossjohn J
14. Brooks AG
(2017) A conserved energetic footprint underpins recognition of human leukocyte antigen-E by two distinct αβ T cell receptors
The Journal of Biological Chemistry 292:21149–21158.

https://doi.org/10.1074/jbc.M117.807719
- PubMed
- Google Scholar
1. Sun M
2. Liu J
3. Qi J
4. Tefsen B
5. Shi Y
6. Yan J
7. Gao GF
(2014) Nα-terminal acetylation for T cell recognition: molecular basis of MHC class I-restricted Nα-acetylpeptide presentation
Journal of Immunology 192:5509–5519.

https://doi.org/10.4049/jimmunol.1400199
- PubMed
- Google Scholar
1. Teixeira JC
2. de Filippo C
3. Weihmann A
4. Meneu JR
5. Racimo F
6. Dannemann M
7. Nickel B
8. Fischer A
9. Halbwax M
10. Andre C
11. Atencia R
12. Meyer M
13. Parra G
14. Pääbo S
15. Andrés AM
(2015) Long-term balancing selection in LAD1 maintains a missense trans-species polymorphism in humans, chimpanzees, and bonobos
Molecular Biology and Evolution 32:1186–1196.

https://doi.org/10.1093/molbev/msv007
- PubMed
- Google Scholar
1. Teze D
2. Hendrickx J
3. Czjzek M
4. Ropartz D
5. Sanejouand YH
6. Tran V
7. Tellier C
8. Dion M
(2014) Semi-rational approach for converting a GH1 β-glycosidase into a β-transglycosidase
Protein Engineering, Design & Selection 27:13–19.

https://doi.org/10.1093/protein/gzt057
- PubMed
- Google Scholar
1. Tian C
2. Hromatka BS
3. Kiefer AK
4. Eriksson N
5. Noble SM
6. Tung JY
7. Hinds DA
(2017) Genome-wide association and HLA region fine-mapping studies identify susceptibility loci for multiple common infections
Nature Communications 8:599.

https://doi.org/10.1038/s41467-017-00257-5
- PubMed
- Google Scholar
1. Ting YT
2. Dahal-Koirala S
3. Kim HSK
4. Qiao S-W
5. Neumann RS
6. Lundin KEA
7. Petersen J
8. Reid HH
9. Sollid LM
10. Rossjohn J
(2020) A molecular basis for the T cell response in HLA-DQ2.2 mediated celiac disease
PNAS 117:3063–3073.

https://doi.org/10.1073/pnas.1914308117
- PubMed
- Google Scholar
1. Tollefsen S
2. Hotta K
3. Chen X
4. Simonsen B
5. Swaminathan K
6. Mathews II
7. Sollid LM
8. Kim CY
(2012) Structural and functional studies of trans-encoded HLA-DQ2.3 (DQA1*03:01/DQB1*02:01) protein molecule
The Journal of Biological Chemistry 287:13611–13619.

https://doi.org/10.1074/jbc.M111.320374
- PubMed
- Google Scholar
1. van de Sandt CE
2. Clemens EB
3. Grant EJ
4. Rowntree LC
5. Sant S
6. Halim H
7. Crowe J
8. Cheng AC
9. Kotsimbos TC
10. Richards M
11. Miller A
12. Tong SYC
13. Rossjohn J
14. Nguyen THO
15. Gras S
16. Chen W
17. Kedzierska K
(2019) Challenging immunodominance of influenza-specific CD8⁺ T cell responses restricted by the risk-associated HLA-A*68:01 allomorph
Nature Communications 10:5579.

https://doi.org/10.1038/s41467-019-13346-4
- PubMed
- Google Scholar
Software
1. Vaughan T
2. Xie W
3. Wu J
(2018) SubstBMA, version dafa621
GitHub.

https://github.com/jessiewu/substBMA
1. Viļuma A
2. Mikko S
3. Hahn D
4. Skow L
5. Andersson G
6. Bergström TF
(2017) Genomic structure of the horse major histocompatibility complex class II region resolved using PacBio long-read sequencing technology
Scientific Reports 7:45518.

https://doi.org/10.1038/srep45518
- PubMed
- Google Scholar
Data
1. Vivian J
2. Rossjohn J
(authors) (2022) HLA-B*27:05 in complex with the pan-HLA-Ia monoclonal antibody W6/32
Worldwide Protein Data Bank.

https://doi.org/10.2210/pdb7t0l/pdb
1. Vizcaíno JA
2. Kubiniok P
3. Kovalchik KA
4. Ma Q
5. Duquette JD
6. Mongrain I
7. Deutsch EW
8. Peters B
9. Sette A
10. Sirois I
11. Caron E
(2020) The human immunopeptidome project: a roadmap to predict and treat immune diseases
Molecular & Cellular Proteomics 19:31–49.

https://doi.org/10.1074/mcp.R119.001743
- Google Scholar
(2021) The new kid on the block: HLA-C, a key regulator of natural killer cells in viral immunity
Cells 10:3108.

https://doi.org/10.3390/cells10113108
- PubMed
- Google Scholar
1. Waage J
2. Standl M
3. Curtin JA
4. Jessen LE
5. Thorsen J
6. Tian C
7. Schoettler N
8. Flores C
9. Abdellaoui A
10. Ahluwalia TS
11. Alves AC
12. Amaral AFS
13. Antó JM
14. Arnold A
15. Barreto-Luis A
16. Baurecht H
17. van Beijsterveldt CEM
18. Bleecker ER
19. Bonàs-Guarch S
20. Boomsma DI
21. Brix S
22. Bunyavanich S
23. Burchard EG
24. Chen Z
25. Curjuric I
26. Custovic A
27. den Dekker HT
28. Dharmage SC
29. Dmitrieva J
30. Duijts L
31. Ege MJ
32. Gauderman WJ
33. Georges M
34. Gieger C
35. Gilliland F
36. Granell R
37. Gui H
38. Hansen T
39. Heinrich J
40. Henderson J
41. Hernandez-Pacheco N
42. Holt P
43. Imboden M
44. Jaddoe VWV
45. Jarvelin M-R
46. Jarvis DL
47. Jensen KK
48. Jónsdóttir I
49. Kabesch M
50. Kaprio J
51. Kumar A
52. Lee Y-A
53. Levin AM
54. Li X
55. Lorenzo-Diaz F
56. Melén E
57. Mercader JM
58. Meyers DA
59. Myers R
60. Nicolae DL
61. Nohr EA
62. Palviainen T
63. Paternoster L
64. Pennell CE
65. Pershagen G
66. Pino-Yanes M
67. Probst-Hensch NM
68. Rüschendorf F
69. Simpson A
70. Stefansson K
71. Sunyer J
72. Sveinbjornsson G
73. Thiering E
74. Thompson PJ
75. Torrent M
76. Torrents D
77. Tung JY
78. Wang CA
79. Weidinger S
80. Weiss S
81. Willemsen G
82. Williams LK
83. Ober C
84. Hinds DA
85. Ferreira MA
86. Bisgaard H
87. Strachan DP
88. Bønnelykke K
89. 23andMe Research Team
90. AAGC collaborators
(2018) Genome-wide association and HLA fine-mapping studies identify risk loci and genetic pathways underlying allergic rhinitis
Nature Genetics 50:1072–1080.

https://doi.org/10.1038/s41588-018-0157-1
- PubMed
- Google Scholar
Book
(1987) The evolution of MHC class II genes within the genus Mus
In: David CS, editor. H-2 Antigens: Genes, Molecules, Function. Springer. pp. 139–153.

https://doi.org/10.1007/978-1-4757-0764-9_14
- Google Scholar
(2010) The structure and stability of the monomorphic HLA-G are influenced by the nature of the bound peptide
Journal of Molecular Biology 397:467–480.

https://doi.org/10.1016/j.jmb.2010.01.052
- PubMed
- Google Scholar
1. Walters LC
2. Rozbesky D
3. Harlos K
4. Quastel M
5. Sun H
6. Springer S
7. Rambo RP
8. Mohammed F
9. Jones EY
10. McMichael AJ
11. Gillespie GM
(2022) Primary and secondary functions of HLA-E are determined by stability and conformation of the peptide-bound complexes
Cell Reports 39:110959.

https://doi.org/10.1016/j.celrep.2022.110959
- PubMed
- Google Scholar
1. Wenger AM
2. Peluso P
3. Rowell WJ
4. Chang P-C
5. Hall RJ
6. Concepcion GT
7. Ebler J
8. Fungtammasan A
9. Kolesnikov A
10. Olson ND
11. Töpfer A
12. Alonge M
13. Mahmoud M
14. Qian Y
15. Chin C-S
16. Phillippy AM
17. Schatz MC
18. Myers G
19. DePristo MA
20. Ruan J
21. Marschall T
22. Sedlazeck FJ
23. Zook JM
24. Li H
25. Koren S
26. Carroll A
27. Rank DR
28. Hunkapiller MW
(2019) Accurate circular consensus long-read sequencing improves variant detection and assembly of a human genome
Nature Biotechnology 37:1155–1162.

https://doi.org/10.1038/s41587-019-0217-9
- PubMed
- Google Scholar
1. Wroblewski EE
2. Guethlein LA
3. Norman PJ
4. Li Y
5. Shaw CM
6. Han AS
7. Ndjango JBN
8. Ahuka-Mundeke S
9. Georgiev AV
10. Peeters M
11. Hahn BH
12. Parham P
(2017) Bonobos maintain immune system diversity with three functional types of MHC-B
Journal of Immunology 198:3480–3493.

https://doi.org/10.4049/jimmunol.1601955
- PubMed
- Google Scholar
1. Wu Y
2. Gao F
3. Liu J
4. Qi J
5. Gostick E
6. Price DA
7. Gao GF
(2011) Structural basis of diverse peptide accommodation by the rhesus macaque MHC class I molecule Mamu-B*17: insights into immune protection from simian immunodeficiency virus
The Journal of Immunology 187:6382–6392.

https://doi.org/10.4049/jimmunol.1101726
- Google Scholar
(2013) Bayesian selection of nucleotide substitution models and their site assignments
Molecular Biology and Evolution 30:669–688.

https://doi.org/10.1093/molbev/mss258
- PubMed
- Google Scholar
1. Xu SX
2. Ren WH
3. Li SZ
4. Wei FW
5. Zhou KY
6. Yang G
(2009) Sequence polymorphism and evolution of three cetacean MHC genes
Journal of Molecular Evolution 69:260–275.

https://doi.org/10.1007/s00239-009-9272-z
- PubMed
- Google Scholar
1. Yagita Y
2. Kuse N
3. Kuroki K
4. Gatanaga H
5. Carlson JM
6. Chikata T
7. Brumme ZL
8. Murakoshi H
9. Akahoshi T
10. Pfeifer N
11. Mallal S
12. John M
13. Ose T
14. Matsubara H
15. Kanda R
16. Fukunaga Y
17. Honda K
18. Kawashima Y
19. Ariumi Y
20. Oka S
21. Maenaka K
22. Takiguchi M
(2013) Distinct HIV-1 escape patterns selected by cytotoxic T cells with identical epitope specificity
Journal of Virology 87:2253–2263.

https://doi.org/10.1128/JVI.02572-12
- PubMed
- Google Scholar
1. Yamamoto Y
2. Morita D
3. Shima Y
4. Midorikawa A
5. Mizutani T
6. Suzuki J
7. Mori N
8. Shiina T
9. Inoko H
10. Tanaka Y
11. Mikami B
12. Sugita M
(2019) Identification and structure of an MHC class I-encoded protein with the potential to present N-myristoylated 4-mer peptides to T cells
Journal of Immunology 202:3349–3358.

https://doi.org/10.4049/jimmunol.1900087
- PubMed
- Google Scholar
1. Yasumizu Y
2. Sakaue S
3. Konuma T
4. Suzuki K
5. Matsuda K
6. Murakami Y
7. Kubo M
8. Palamara PF
9. Kamatani Y
10. Okada Y
(2020) Genome-wide natural selection signatures are linked to genetic risk of modern phenotypes in the Japanese population
Molecular Biology and Evolution 37:1306–1316.

https://doi.org/10.1093/molbev/msaa005
- PubMed
- Google Scholar
1. Yin L
2. Crawford F
3. Marrack P
4. Kappler JW
5. Dai S
(2012) T-cell receptor (TCR) interaction with peptides that mimic nickel offers insight into nickel contact allergy
PNAS 109:18517–18522.

https://doi.org/10.1073/pnas.1215928109
- PubMed
- Google Scholar
(2004) A polymorphic pocket at the P10 position contributes to peptide binding specificity in class II MHC proteins
Chemistry & Biology 11:1395–1402.

https://doi.org/10.1016/j.chembiol.2004.08.007
- Google Scholar
1. Zhang S
2. Liu J
3. Cheng H
4. Tan S
5. Qi J
6. Yan J
7. Gao GF
(2011) Structural basis of cross-allele presentation by HLA-A*0301 and HLA-A*1101 revealed by two HIV-derived peptide complexes
Molecular Immunology 49:395–401.

https://doi.org/10.1016/j.molimm.2011.08.015
- PubMed
- Google Scholar
1. Zhao M
2. Wang Y
3. Shen H
4. Li C
5. Chen C
6. Luo Z
7. Wu H
(2013) Evolution by selection, recombination, and gene duplication in MHC class I genes of two Rhacophoridae species
BMC Evolutionary Biology 13:113.

https://doi.org/10.1186/1471-2148-13-113
- PubMed
- Google Scholar
1. Zhou F
2. Cao H
3. Zuo X
4. Zhang T
5. Zhang X
6. Liu X
7. Xu R
8. Chen G
9. Zhang Y
10. Zheng X
11. Jin X
12. Gao J
13. Mei J
14. Sheng Y
15. Li Q
16. Liang B
17. Shen J
18. Shen C
19. Jiang H
20. Zhu C
21. Fan X
22. Xu F
23. Yue M
24. Yin X
25. Ye C
26. Zhang C
27. Liu X
28. Yu L
29. Wu J
30. Chen M
31. Zhuang X
32. Tang L
33. Shao H
34. Wu L
35. Li J
36. Xu Y
37. Zhang Y
38. Zhao S
39. Wang Y
40. Li G
41. Xu H
42. Zeng L
43. Wang J
44. Bai M
45. Chen Y
46. Chen W
47. Kang T
48. Wu Y
49. Xu X
50. Zhu Z
51. Cui Y
52. Wang Z
53. Yang C
54. Wang P
55. Xiang L
56. Chen X
57. Zhang A
58. Gao X
59. Zhang F
60. Xu J
61. Zheng M
62. Zheng J
63. Zhang J
64. Yu X
65. Li Y
66. Yang S
67. Yang H
68. Wang J
69. Liu J
70. Hammarström L
71. Sun L
72. Wang J
73. Zhang X
(2016) Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease
Nature Genetics 48:740–746.

https://doi.org/10.1038/ng.3576
- PubMed
- Google Scholar
1. Zhu S
2. Liu K
3. Chai Y
4. Wu Y
5. Lu D
6. Xiao W
7. Cheng H
8. Zhao Y
9. Ding C
10. Lyu J
11. Lou Y
12. Gao GF
13. Liu WJ
(2019) Divergent peptide presentations of HLA-A*30 alleles revealed by structures with pathogen peptides
Frontiers in Immunology 10:1709.

https://doi.org/10.3389/fimmu.2019.01709
- PubMed
- Google Scholar

Article and author information

Author details

Alyssa Lyn Fortier
1. Department of Biology, Stanford University, Stanford, United States
2. Department of Genetics, Stanford University, Stanford, United States
Contribution
Data curation, Software, Formal analysis, Investigation, Visualization, Methodology, Writing – original draft

For correspondence
afortier@stanford.edu

Competing interests
No competing interests declared

ORCID iD: 0000-0001-5964-2540
Jonathan K Pritchard
1. Department of Biology, Stanford University, Stanford, United States
2. Department of Genetics, Stanford University, Stanford, United States
Contribution
Conceptualization, Resources, Supervision, Funding acquisition, Investigation, Methodology, Project administration, Writing – review and editing

Competing interests
No competing interests declared

ORCID iD: 0000-0002-8828-5236

Funding

National Institutes of Health (R01 HG011432)

Alyssa Lyn Fortier
Jonathan K Pritchard

National Institutes of Health (R01 HG008140)

Alyssa Lyn Fortier
Jonathan K Pritchard

National Science Foundation (DGE-1656518)

Alyssa Lyn Fortier

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We acknowledge support from NIH grants R01 HG011432 and R01 HG008140. This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-1656518. We appreciate helpful comments from Jeffrey Spence, the Pritchard lab, and the reviewers of the previous version of this work.

Version history

Preprint posted: September 17, 2024
Sent for peer review: October 8, 2024
Reviewed Preprint version 1: January 8, 2025
Reviewed Preprint version 2: June 30, 2025
Version of Record published: September 12, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.103547. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.