Figures and data


Peptides isolated from class I molecules of chicken blood, spleen and a B21 cell line separated by HPLC with single peaks subjected to gas phase sequencing show a range of lengths and sequences.
Methods are detailed in Chappell et al., 2015; Koch et al., 2008; Wallny et al., 2006. The peptides are grouped by the six experiments with locations and cell sources indicated. Sequences are in single letter amino acid code; a small letter indicates the dominant amino acid in a position, and x indicates an ambiguous call. The amino acids are coloured throughout this paper according to their primary characteristic (with the understanding that some amino acids could be characterised in more than one way, e.g. Y both hydrophobic and polar, K basic and hydrophobic, etc.): red, acidic, D and E; blue, basic, H, K and R; green, polar, N, Q, S and T; black, hydrophobic, A, C, F, G, I, L, M, P, V, W and Y. Designations on the left indicate synthetic peptides used for structures (S), thermostability assays (T) or assembly assays (A), or found as natural peptides by immunopeptidomics (I). Chicken genes from which a (partial) peptide sequence could be identified are presented, with the number of amino acid positions matching compared to total length in brackets.
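The colour scheme in this legend amounts to a lookup from residue to primary class. The following Python sketch is purely illustrative: the names COLOUR_CLASSES, RESIDUE_COLOUR and colour_peptide are hypothetical and not part of the original analysis, and each residue is assigned only to the primary class stated above.

```python
# Primary-class colour scheme as stated in the legend (names are illustrative).
COLOUR_CLASSES = {
    "red": ("acidic", "DE"),
    "blue": ("basic", "HKR"),
    "green": ("polar", "NQST"),
    "black": ("hydrophobic", "ACFGILMPVWY"),
}

# Invert to a per-residue lookup: residue -> (colour, class name).
RESIDUE_COLOUR = {
    aa: (colour, cls)
    for colour, (cls, residues) in COLOUR_CLASSES.items()
    for aa in residues
}


def colour_peptide(sequence):
    """Return (residue, colour) pairs for a peptide.

    Small letters (dominant residue in a position) are coloured like
    their capital form; 'x' (ambiguous call) gets no colour (None).
    """
    pairs = []
    for aa in sequence:
        colour, _cls = RESIDUE_COLOUR.get(aa.upper(), (None, None))
        pairs.append((aa, colour))
    return pairs
```

For example, colour_peptide("REVDEQLLSV") starts with ("R", "blue") and ("E", "red"), matching the basic and acidic classes above.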

Representative SEC traces for assembly assays show four typical outcomes for BF2*21:01.
After refolding of bacterially-expressed heavy chain and β2-microglobulin with synthetic peptides according to published methods (Chappell et al., 2015; Koch et al., 2008), samples were spun in a cooled microfuge at full speed for 30 minutes, the supernatant was passed through a 0.2 μm filter, and the flow-through was loaded on a SEC column (in this case, a Superdex 200 10/300 GL column) attached to an AKTA FPLC instrument. In every run, peaks at and slightly after the exclusion volume represented aggregated material, and a peak before the inclusion volume represented refolded β2-microglobulin. In between, four outcomes were seen: a sharp peak represented stably refolded monomers of heavy chain, β2-microglobulin and peptide (blue trace); a broader peak beginning at the same place as the monomer peak presumably represented refolded monomers that were dissociating during chromatography (red); a later broad peak was at the same position as heavy chain refolded alone (brown); and for some peptides there was no peak between the aggregate peaks and the β2-microglobulin peak, representing no successful refolding (green). The x-axis represents elution volume in ml, while the y-axis represents OD215 in mAU.

Assembly of BF2*21:01 molecules by refolding in vitro is markedly affected by single amino acid differences between peptides.
Bacterially-expressed heavy chain and β2-microglobulin were refolded in vitro with single synthetic peptides (in-house identification of peptide batch by P and number), shown as single letter code with anchor residues underlined and coloured (red, acidic; blue, basic; black, hydrophobic), changes from the original peptide bolded, and a dot to indicate a residue skipped. Results after analysis by SEC are given as +, sharp peak at position of monomer; b, broad peak starting at position of monomer; h, sharp peak at position of heavy chain alone; 0, no peak between the aggregated material at the exclusion volume of the column and the peak representing refolded β2-microglobulin. Methods as in Fig. 2.

The original 10mer (REVDEQLLSV) and 11mer (GHAEEYGAETL) peptides refold with BF2*21:01 to give stable monomers as do 11mer and 10mer derivative peptides, but 9mer, 8mer or 7mer peptides give heavy chain only peaks.
Assembly assays for representative 10mer and 11mer peptides give peaks for stable monomers of BF2*21:01, but shorter peptides based on the 11mer do not. Single synthetic peptides were refolded with bacterially-expressed heavy chain and β2-microglobulin, prepared for SEC as described in the legend for Fig. 2, and analysed in this case with a HiLoad 26/60 Superdex 200 column attached to an AKTA FPLC instrument. The peptides 11mer GHAEEYGAETL (top panel), 10mer REVDEQLLSV (middle panel), and 11mer GHAEAAAAETL and 10mer GHAEAAAETL (bottom panel) gave sharp monomer peaks, while the 9mer GHAEAAETL, 8mer GHAEAETL and 7mer GHAEETL gave a delayed broad peak indicative of unstable binding or heavy chain alone.

Crystal structures of BF2*21:01 with peptides that refold to give stable monomers and models of structures with peptides that either did not refold to give stable monomers or refolded to give monomers with low thermal stability.
A. Structures were determined for GHAEEYGAETL (3BEV), GHAEEYGADTL (5AD0) and GRAEEYGADTL (5ACZ), which all refolded successfully to give stable monomers, while GRAEEYGAETL did not (see Fig. 3), and all models of it showed steric clashes (one depicted, red arrow). B. Two 11mers refolded to give monomers, which were much more stable for GHAEEYGAETL than for GHPDWYYGETW (see Fig. 6). Models of GHAEEYGAETW and GHPDWYYGETW show steric clashes (red arrows), but the crystal structure with GHAEEYGAETL (3BEV) and a model of GHPDWYYGETL do not (green arrows). Underlined positions indicate anchor residues; bold letters indicate amino acid substitutions; modelled positions are in orange. Methods for refolding and analysis as in Fig. 2, modelling done as detailed in Materials and Methods, structure from Koch et al., 2008.

Peptide length and amino acids at peptide positions P2, P3, Pc-3, Pc-2 and Pc affect stability of BF2*21:01 molecules.
Bacterially-expressed heavy chain and β2-microglobulin were refolded in vitro with various synthetic peptides, and the melting point was measured as 50% uptake of a fluorescent dye that binds to hydrophobic regions, as described in Materials and Methods. Assays were performed in triplicate with error bars indicating standard errors. Peptide sequences are in single letter code with in-house identification of peptide batch by P and number.
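As an illustration of how a melting point defined as 50% dye uptake might be read off a melt curve, the sketch below linearly interpolates the temperature at which fluorescence first reaches half of its range, and computes a standard error across replicates. The function names and any data passed to them are hypothetical; the procedure actually used is the one described in Materials and Methods.

```python
import math
import statistics


def melting_point(temps, fluorescence):
    """Temperature at which the signal first rises through the midpoint
    between its minimum and maximum (linear interpolation between readings)."""
    low, high = min(fluorescence), max(fluorescence)
    half = low + 0.5 * (high - low)
    readings = list(zip(temps, fluorescence))
    for (t0, f0), (t1, f1) in zip(readings, readings[1:]):
        if f0 <= half <= f1 and f1 != f0:
            return t0 + (half - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("signal never rises through the half-maximal level")


def standard_error(values):
    """Standard error of the mean for replicate measurements."""
    return statistics.stdev(values) / math.sqrt(len(values))
```

With hypothetical readings of 0, 10, 90 and 100 units at 30, 40, 50 and 60 degrees, the half-maximal level of 50 units is crossed between 40 and 50 degrees, and interpolation gives 45 degrees.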

Assembly assays with single-substitution peptide libraries based on the 11mer GHAEEYGAETL and the 10mer REVDEQLLSV show marked preferences for particular amino acids at P2, P3, Pc-3, Pc-2 and Pc positions binding to BF2*21:01.
Bacterially-expressed heavy chain and β2-microglobulin were mixed with peptide libraries (19 amino acids, all except Cys, at one position) at two molar ratios (1:2:10 as usual, and 1:2:2 to control for strong binding peptides) and separated by SEC as described in the legend to Fig. 2; monomer peaks were collected, concentrated, treated with acid, lyophilised and analysed by MALDI-TOF. Bar charts show proportion of total signal (with error bars when present showing standard error around the mean for separate refolding experiments) on the y-axis for mass/charge (m/z) positions corresponding to peptides with particular amino acids on the x-axis (single letter code; Ile and Leu were indistinguishable, as were Gln and Lys).

Assembly assays with double-substitution peptide libraries based on the 11mer GHAEEYGAETL (top panel) and the 10mer REVDEQLLSV (bottom panel) show marked preferences for particular amino acid combinations at P2 and Pc-2 positions binding to BF2*21:01, with more combinations found for the 10mer.
Bacterially-expressed heavy chain and β2-microglobulin were mixed with peptide libraries (19 amino acids, all except Cys, at two positions indicated by x) and either separated immediately by SEC as described in the legend to Fig. 2, or incubated at 4°C or 42°C for 18 hours before SEC; monomer peaks were collected, concentrated, treated with acid, lyophilised and analysed by MALDI-TOF. Bar charts show proportion of total signal (with error bars when present showing standard error around the mean for separate refolding experiments) on the x-axis for mass/charge (m/z) positions corresponding to peptides with particular amino acid combinations on the y-axis (single letter code; Ile and Leu, Gln and Lys, and amino acids present in P2 versus Pc-2 positions were indistinguishable).
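The ambiguities named in this legend reduce the number of combinations that a MALDI-TOF readout of a double-substitution library can resolve. The sketch below is a hypothetical illustration (the names MERGED, maldi_label and distinguishable_labels are not from the original work): it merges Ile with Leu and Lys with Gln, and ignores which of the two residues sits at P2 and which at Pc-2. It encodes only the ambiguities stated in the legend and does not model any further mass coincidences between different residue pairs.

```python
from itertools import product

AMINO_ACIDS = "ADEFGHIKLMNPQRSTVWY"  # 19 residues, all except Cys

# Residues the legend treats as indistinguishable: Ile is reported as Leu,
# Lys is reported as Gln (the choice of representative is arbitrary).
MERGED = {"I": "L", "K": "Q"}


def maldi_label(p2, pc2):
    """Collapse an ordered (P2, Pc-2) pair to the label that can be resolved.

    The two residues are sorted because position cannot be assigned
    from the mass of the whole peptide.
    """
    a, b = (MERGED.get(residue, residue) for residue in (p2, pc2))
    return "".join(sorted((a, b)))


def distinguishable_labels():
    """All labels that the 19 x 19 ordered combinations collapse into."""
    return {maldi_label(a, b) for a, b in product(AMINO_ACIDS, repeat=2)}
```

Under these assumptions the 361 ordered combinations collapse to 153 labels (17 residue classes taken as unordered pairs, repeats allowed), so, for example, His at P2 with Glu at Pc-2 and Glu at P2 with His at Pc-2 both appear as "EH".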


A-J. Assembly assays with double-substitution peptide libraries based on five 11mer peptides (GHPDWYYGETW and four derivatives of the 11mer GHAEEYGAETL) and five 10mers (all from the original peptides found by gas phase sequencing, including the original REVDEQLLSV) show different preferences for particular amino acid combinations at P2, P3, Pc-3 and Pc-2 positions binding to BF2*21:01, with more combinations found for 10mer peptides. Bacterially-expressed heavy chain and β2-microglobulin were mixed with peptide libraries (19 amino acids, all except Cys, at two positions, indicated by x in the peptide sequence) at two molar ratios (1:2:10 as usual, and 1:2:2 to control for strong binding peptides) and separated immediately by SEC as described in the legend to Fig. 2; monomer peaks were collected, concentrated, treated with acid, lyophilised and analysed by MALDI-TOF. Bar charts show proportion of total signal (with error bars when present showing standard error around the mean for separate refolding experiments) on the x-axis for mass/charge (m/z) positions corresponding to peptides with particular amino acid combinations on the y-axis (single letter code; Ile and Leu, Gln and Lys, and amino acids present in P2 versus Pc-2 positions were indistinguishable). Panels: A. GHxEEYGxETL, B. GExEEYGxLTL, C. GxPDWYYGxTW, D. GxVEEYGAxTL, E. GxAEEYGAxTL, F. TxGQEDYxRL, G. TxPESKVxYL, H. YxLDEKFxRL, I. RxVDEQLxSV, J. RxNYGIIxSF.



A-F. Comparison of assembly assays of double-substitution libraries analysed by MALDI-TOF and by LC-MS/MS gives similar results, with different preferences for particular amino acid combinations at P2, P3, Pc-3 and Pc-2 positions binding to BF2*21:01, and with more combinations for 10mer peptides. Bacterially-expressed heavy chain and β2-microglobulin were mixed with peptide libraries (19 amino acids, all except Cys, at two positions indicated by x in the peptide sequence) at two molar ratios (1:2:10 as usual, and 1:2:2 to control for strong binding peptides) and separated by SEC as described in the legend to Fig. 2; monomer peaks were collected, concentrated, treated with acid, lyophilised and analysed. Separate experiments were analysed by MALDI-TOF (blue bars, amino acids in single letter code with small letters, order not meaningful) and by LC-MS/MS (orange and green stacked bars, amino acids in single letter code with capital letters, first letter corresponding to P2 or P3 and second letter corresponding to Pc-2 or Pc-3, depending on peptide). Bar charts show proportion of total signal (with error bars when present showing standard error around the mean for separate refolding experiments) on the x-axis for mass/charge (m/z) positions corresponding to peptides with particular amino acid combinations on the y-axis. Panels: A. GxAEEYGAxTL, B. GxVEEYGAxTL, C. GxPDWYYGxTW, D. RxVDEQLxSV, E. GExEEYGxLTL, F. GHxEEYGxETL.

The frequencies of particular amino acids found at positions P2 and Pc-2 of peptides bound to class I molecules from the B21 haplotype vary considerably, as assessed by LC-MS/MS of peptides from two double-substitution libraries with BF2*21:01 and from immunopeptidomics of a B21 cell line.
For assembly assays, bacterially-expressed heavy chain and β2-microglobulin were refolded in vitro with either GxAEEYGAxTL or RxVDEQLxSV (single letter code; x means roughly equal proportions of 19 naturally-occurring amino acids, all except Cys), and separated by SEC; monomer peaks were collected, concentrated, treated with acid, and lyophilised. For immunopeptidomics, class I molecules from detergent-solubilised AVOL-1 cells were isolated by affinity chromatography with monoclonal antibody F21-2. Samples were analysed by LC-MS/MS, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P2 or Pc-2 found by mass spectroscopy; immunopeptidomic data are for 10mer peptides only.

The frequencies of particular amino acids found at positions P2 and Pc-2 in peptides bound to BF2*21:01 vary considerably and are affected by the amino acid present at P3, as assessed by LC-MS/MS of peptides from assembly assays with three double-substitution libraries.
Bacterially-expressed heavy chain and β2-microglobulin were refolded in vitro with either GxAEEYGAxTL, GxVEEYGAxTL or GxPDWYYGxTW (single letter code; x means roughly equal proportions of 19 naturally-occurring amino acids, all except Cys) and separated by SEC as described in the legend to Fig. 2; monomer peaks were collected, concentrated, treated with acid, lyophilised and analysed by LC-MS/MS, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P2 or Pc-2 found by mass spectroscopy.

The frequency of particular amino acids found at positions P3 and Pc-3 in peptides bound to BF2*21:01 is affected by the amino acids present at P2 and Pc-2, as assessed by LC-MS/MS of peptides from assembly assays with two double-substitution libraries.
Bacterially-expressed heavy chain and β2-microglobulin were refolded in vitro with either GExEEYGxLTL or GHxEEYGxETL (single letter code; x means roughly equal proportions of 19 naturally-occurring amino acids, all except Cys) and separated by SEC as described in the legend to Fig. 2; monomer peaks were collected, concentrated, treated with acid, lyophilised and analysed by LC-MS/MS, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P3 or Pc-3 found by mass spectroscopy.

Most peptides found by immunopeptidomics of a B21 cell line were between 8 and 12 amino acids in length, with the most frequent being 10mers, all of which had complex peptide motifs.
For immunopeptidomics, class I molecules from detergent-solubilised AVOL-1 cells were isolated by affinity chromatography with monoclonal antibody F21-2. Samples were analysed by LC-MS/MS, with bar graphs showing the number of sequences with different lengths as found by mass spectroscopy. Peptide motifs were determined by Gibbs clustering.

The frequency of particular amino acids found at positions P2, Pc-2 and Pc in peptides bound to class I molecules from a B21 cell line varies depending on length of peptide, with Leu predominant at Pc for all lengths, and with 8mers more like 10mers at P2 and Pc-2.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P2, Pc-2 and Pc found by mass spectroscopy.

The frequency of particular amino acids found at positions P2, Pc-2 and Pc in peptides bound to class I molecules from a B21 cell line varies depending on length of peptide, but patterns found in peptides up to 12mers are not so obvious in longer peptides.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P2, Pc-2 and Pc found by mass spectroscopy.

The frequencies of particular amino acids found at position Pc-2 in peptides bound to class I molecules from a B21 cell line vary depending on whether the amino acids at P2 are acidic (Asp and Glu), basic (His, Lys and Arg), polar (Asn, Gln, Ser and Thr) or hydrophobic (Ala, Cys, Phe, Gly, Ile, Leu, Met, Pro, Val, Trp and Tyr), but Leu is found overwhelmingly at Pc in all peptides.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P2, Pc-2 or Pc found by mass spectroscopy.

Many different combinations of amino acids at positions P2 and Pc-2 in peptides bound to class I molecules from a B21 cell line are found by immunopeptidomics, but only a few are found frequently.
Peptides identified by immunopeptidomics as in Fig. 14, with a heat map showing combinations of amino acids (single letter code, with P2 on left and Pc-2 on top) at intersections, with empty cells indicating that no peptides were found, and colours indicating the numbers of peptides found, from low frequency in blue to high frequency in red.

Only a few combinations of amino acids at positions P2 and Pc-2 in peptides bound to class I molecules from a B21 cell line are found at frequencies of 1% or higher.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with different combinations of amino acids at P2 and Pc-2 (single letter code, with P2 first and then Pc-2) found by mass spectroscopy.
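A tally of this kind can be sketched as follows, assuming only that P2 is the second residue and Pc-2 is the third residue from the C-terminus, so that peptides of different lengths can be pooled. The function name p2_pc2_combinations and its min_fraction parameter are hypothetical, and the example input below is a toy list of peptides named elsewhere in these legends rather than the immunopeptidomic dataset.

```python
from collections import Counter


def p2_pc2_combinations(peptides, min_fraction=0.01):
    """Percentage of peptides carrying each (P2, Pc-2) combination,
    keeping only combinations at or above min_fraction of all peptides.

    Labels give P2 first and then Pc-2. Peptides shorter than five
    residues are skipped because P2 and Pc-2 would not be distinct.
    """
    counts = Counter(p[1] + p[-3] for p in peptides if len(p) >= 5)
    total = sum(counts.values())
    return {
        combo: 100.0 * n / total
        for combo, n in counts.most_common()
        if n / total >= min_fraction
    }


# Toy input: REVDEQLLSV has Glu at P2 and Leu at Pc-2, giving the label "EL".
example = p2_pc2_combinations(
    ["REVDEQLLSV", "REVDEQLLSV", "GHAEEYGAETL", "GHAEEYGADTL"]
)
```

The default threshold of 0.01 corresponds to the 1% cut-off in the title of this figure; indexing from the C-terminus is a design choice that lets 10mers and 11mers be counted together.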

The frequencies of particular amino acids found at positions P3 and Pc-3 in peptides bound to class I molecules from a B21 cell line vary depending on length of peptide.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with different amino acids (single letter code) at P3 or Pc-3 found by mass spectroscopy.

The frequencies of groups of amino acids (acidic, basic, large hydrophobic, small and Pro) found at position P3 in peptides bound to class I molecules from a B21 cell line vary depending on the amino acid at P2 and on the length of peptide.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with amino acids at P2 on the x-axis against different groups of amino acids (single letter code: acidic being Asp and Glu; basic being His, Lys and Arg; large hydrophobic being Phe, Ile, Leu, Met, Trp and Tyr; small being Ala, Cys, Gly and Ser; Pro) at P3 on the y-axis, as found by mass spectroscopy.

The frequencies of groups of amino acids (acidic, basic, large hydrophobic, small and Pro) found at position Pc-3 in peptides bound to class I molecules from a B21 cell line vary depending on the amino acid at Pc-2 and on length of peptide, with 8mers more like 10mers.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with amino acids at Pc-2 on the x-axis against different groups of amino acids (single letter code: acidic being Asp and Glu; basic being His, Lys and Arg; large hydrophobic being Phe, Ile, Leu, Met, Trp and Tyr; small being Ala, Cys, Gly and Ser; Pro) at Pc-3 on the y-axis, as found by mass spectroscopy.

The frequency of Leu found at position Pc in peptides bound to class I molecules from a B21 cell line varies depending on length of peptide.
Peptides identified by immunopeptidomics as in Fig. 14, with bar graphs showing the percentage of sequences with Leu at Pc on the y-axis against amino acids (single letter code) at Pc-2 on the x-axis, as found by mass spectroscopy.

The frequencies of groups of amino acids (Leu, other hydrophobic, acidic, basic, polar) found at Pc in peptides bound to class I molecules from a B21 cell line vary depending on the amino acid at Pc-2 and on length of peptide, with significant proportions of acidic, basic and/or polar amino acids compared to Leu at Pc for certain amino acids at Pc-2 in 9mers and 11mers.
Peptides identified by immunopeptidomics as in Fig. 14, with stacked bar graphs showing the number of sequences with Leu, other hydrophobic (Ala, Cys, Phe, Gly, Ile, Met, Pro, Val, Trp and Tyr), acidic (Asp and Glu), basic (His, Lys and Arg) and polar (Asn, Gln, Ser and Thr) amino acids at Pc against length of peptide and amino acid (single letter code) at Pc-2, as found by mass spectroscopy.

Superimposition of peptides from eight structures of BF2*21:01 shows that the C-terminal portions all have similar conformations, while the N-terminal portions vary considerably.
Bacterially-expressed heavy chain and β2-microglobulin were refolded in vitro with various synthetic peptides, purified by SEC and crystallized, and the X-ray structures were determined. Depicted are the main chains, with N-termini on the left, viewed from the side of the α2 helix of the class I molecule. Structures are 3BEV (GHAEEYGAETL, green) and 3BEW (REVDEQLLSV, light blue) (Koch et al., 2008), 2YEZ (TNPESKVFYL, pink), 4D0B (TAGQEDYDRL, yellow), 4D0C (TAGQSNYDRL, wheat) and 4CVZ (YELDEKFDRL, grey) (Chappell et al., 2015), and 5AD0 (GHAEEYGADTL, dark blue) and 5ACZ (GRAEEYGADTL, orange) (this publication).
