Computationally-driven identification of antibody epitopes

  1. Casey K Hua
  2. Albert T Gacerez
  3. Charles L Sentman
  4. Margaret E Ackerman
  5. Yoonjoo Choi (corresponding author)
  6. Chris Bailey-Kellogg (corresponding author)
  1. Dartmouth College, United States
  2. Korea Advanced Institute of Science and Technology (KAIST), Republic of Korea
Research Article
Cite this article as: eLife 2017;6:e29023 doi: 10.7554/eLife.29023

Abstract

Understanding where antibodies recognize antigens can help define mechanisms of action and provide insights into progression of immune responses. We investigate the extent to which information about binding specificity implicitly encoded in amino acid sequence can be leveraged to identify antibody epitopes. In computationally-driven epitope localization, possible antibody–antigen binding modes are modeled, and targeted panels of antigen variants are designed to experimentally test these hypotheses. Prospective application of this approach to two antibodies enabled epitope localization using five or fewer variants per antibody, or alternatively, a six-variant panel for both simultaneously. Retrospective analysis of a variety of antibodies and antigens demonstrated an almost 90% success rate with an average of three antigen variants, further supporting the observation that the combination of computational modeling and protein design can reveal key determinants of antibody–antigen binding and enable efficient studies of collections of antibodies identified from polyclonal samples or engineered libraries.

https://doi.org/10.7554/eLife.29023.001

Introduction

Antibodies have long been recognized for their beneficial roles in vaccination, infection, and clinical therapy, as well as their pathogenic roles in autoimmunity. The protective and/or pathogenic capacity of an antibody (Ab) is functionally delimited by the specific epitope(s) that it recognizes on an antigen (Ag). Thus, even Abs targeting the same Ag have demonstrated variable efficacy dependent upon their epitope specificities. In cancer, Abs against particular epitopes have demonstrated increased therapeutic effects and decreased off-tumor toxicities (Garrett et al., 2009; Gan et al., 2012; Kim et al., 2002), and combinations of mAbs targeting diverse epitopes have demonstrated synergistic action and delayed the development of treatment resistance (Koefoed et al., 2011; Friedman et al., 2005). Similarly, Abs against particular epitopes have been associated with protection in the setting of vaccination (Zolla-Pazner et al., 2014; Gottardo et al., 2013; Steel et al., 2010; Gocník et al., 2007; Liu et al., 2016; Eggink et al., 2014; Margine et al., 2013) and natural infection (Walker et al., 2010; Lu et al., 2016). As a result, the identification of epitopes contributing to potent antibody bioactivity is rapidly gaining attention in vaccine design efforts (Haynes, 2015; West et al., 2014; Zolla-Pazner et al., 2016; Correia et al., 2014; He et al., 2015; Lanzavecchia et al., 2016; Pica and Palese, 2013; Subbarao and Matsuoka, 2013) as well as in reducing immunogenic responses against protein drug candidates (Nagata and Pastan, 2009; Onda et al., 2011; Onda et al., 2008).

While characterization of epitope specificities is important for both scientific investigation and clinical translation, the epitopes targeted by newly isolated Abs are often unknown. In the therapeutic setting, novel Ag-specific antibodies are typically discovered through in vitro selections or in vivo immunizations using whole Ag proteins (Lu et al., 2012; Nelson et al., 2010). Such efforts generate multiple Ab candidates simultaneously, which may target multiple different (and unknown) regions on the Ag. Since only a limited number of candidates may be taken forward, it may be helpful at this early stage to distinguish their modes of recognition, which in turn may impact their mechanisms of action (Ditto and Brooks, 2016; Brooks et al., 2014). Similarly, in the vaccine setting, Abs purified from subject sera may target a wide range of epitopes, some previously determined but some with potent new modes of binding, potentially conferring different protective mechanisms (Zolla-Pazner, 2004; Lewis, 2010; Khurana et al., 2016). Because next generation sequencing and more robust Ab discovery platforms are greatly expanding the repertoire of Ag-specific Abs with known sequences (Doria-Rose et al., 2014; Liao et al., 2013; Wu et al., 2015), efficient identification of epitopes from Ab sequence is a highly attractive target (Robinson, 2015) by which to monitor the immune response, characterize the development of Ab repertoires, and potentially lead to novel discoveries and new therapies.

To characterize Ab epitopes, structure determination (Saul and Alzari, 1996) is the gold standard (Abbott et al., 2014), but may be impractical in early stages or in investigations involving multiple Abs. Consequently, a variety of other methods have been developed that trade off resolution in favor of reduced time and expense. For example, epitope-binning assays (Ditto and Brooks, 2016; Brooks et al., 2014) can compare tens or even hundreds of Abs at a time, but are currently limited in resolution to identifying only which specificities overlap. Site-directed mutagenesis approaches, such as alanine scanning (Weiss et al., 2000), have become relatively routine (Greenspan and Di Cera, 1999) and can identify specific Ag residues critical for Ab binding, but require expression and testing of fairly large numbers of variants. Recently, Kowalsky et al. scaled this approach significantly, demonstrating that a comprehensive combination of mutagenesis, surface display, and deep sequencing can provide fast and effective fine epitope mapping (Kowalsky et al., 2015). Spectrometry-based methods such as HDX-MS (Gallagher and Hudgens, 2016; Huang and Chen, 2014) and NMR (Zuiderweg, 2002) similarly offer very detailed resolution of residues involved, but require expensive equipment and specific expertise in processing and analyzing results and may be subject to protein size limitations.

In comparison to experimental efforts, computational analysis is efficient and inexpensive, and thus much effort has gone into developing methods to predict epitopes in silico. Many methods have attempted to perform epitope prediction in the absence of any information about particular antibodies (reviewed in [Gao and Kurgan, 2014; Zhang et al., 2014]), in order to help predict the overall immunogenicity of an Ag. Unfortunately, recent collaborative efforts between computational and experimental researchers have suggested that since any Ag surface region has the potential to serve as an epitope, such predictions may be of limited practical utility (Sela-Culang et al., 2015). In contrast, computational efforts to predict epitopes for specified Abs (given the amino acid sequence and potentially a structure or homology model) address this concern and have recently made substantial progress (Sela-Culang et al., 2015; Zhao and Li, 2010; Zhao et al., 2011; Soga et al., 2010; Brenke et al., 2012; Krawczyk et al., 2014; Sircar and Gray, 2010; Sela-Culang et al., 2013). While promising, purely computational methods are not yet sufficiently reliable to stand alone (Sela-Culang et al., 2013; Yao et al., 2013). Thus, a recently proposed paradigm integrates computational and experimental methods, leveraging the advantages of each (Sela-Culang et al., 2015). Integrated methods to date are predicated on the availability of initial experimental information to improve computational predictions, which are subsequently experimentally tested to definitively identify epitopes (Kowalsky et al., 2015; Araya and Fowler, 2011; Chuang et al., 2013; Sela-Culang et al., 2014).

We investigate the extent to which the Ab-Ag recognition information encoded in the proteins themselves can be harnessed to drive the entire epitope localization process. In particular, we study the ability of computational methods to optimize experimental validation of computational predictions, thereby focusing and minimizing experimental effort. Ab-Ag docking benchmarks have shown that at least one near-native docking model can usually be found among generated samples; unfortunately, such methods fail to reliably indicate which one (Brenke et al., 2012). However, the models can be viewed as hypotheses (though not necessarily mutually exclusive) to be experimentally tested via site-directed mutagenesis and binding assays. Through EpiScope, an integrated computational-experimental approach (Figure 1), Ag variants are designed for each docking model such that, if a model is consistent with the true binding mode, Ab binding will be ablated in the corresponding Ag variant(s). The variants are distilled to a small set representing all docking models such that if any of the models is correct, one of the variants will fail to bind the Ab. Experimental identification of a variant with disrupted binding enables localization of the epitope to include one or more of the mutated residues. The docking models are then filtered to a smaller set (there need not be a unique one) representing binding modes and corresponding footprints on the Ag that are consistent with the effects of the disruptive mutations. As we demonstrate in prospective application to two Abs against a tumor antigen, as well as thorough retrospective testing with a wide range of Ab-Ag pairs, Ab sequence and Ag structure alone are sufficient to drive efficient targeting of experimental effort to effectively localize epitopes. We further demonstrate that decoding binding information from sequence enables the multiplexing of experimental epitope localization efforts for multiple Abs targeting the same Ag.

Overview of computationally-driven epitope identification by EpiScope.

(A) Ab–Ag docking models are generated using computational docking methods. In the example, the green structure is the Ag human IL-18 (PDB ID: 2VXT:A), while the cartoons represent possible poses of the Ab (limited here to three for clarity). Full details including docking models and designs for this example are provided in a PyMol session (Supplementary file 1). (B) Ag variants containing a pre-defined number of mutations (here triple mutations, colored triangles) are generated for each docking model. (C) Variants are clustered with respect to spatial locations in the Ag, and a set of variants predicted to disrupt all of the docking models is selected. (D) Ag mutagenesis and Ag-Ab binding experiments are performed to identify which mutations result in loss of Ab recognition. (E) Examination of the disruptive variant(s) enables localization of the Ab epitope in terms of both mutated positions (pink balls) and consistent docking models, here with the model (light pink cartoon) quite similar to the actual crystal structure (dark pink cartoon).

https://doi.org/10.7554/eLife.29023.002

Results

Computationally-driven Ab epitope identification: EpiScope

The integrated computational-experimental framework is described in Figure 1 (full details are provided as a PyMol session file, Supplementary file 1) and was implemented as follows. Ab–Ag docking models were generated by the ClusPro server (Brenke et al., 2012; Comeau et al., 2004a), which in a recent Ab–Ag docking benchmark (Brenke et al., 2012) demonstrated a near-native docking model among its top 30 predictions in 95% of test cases. For each model, site-directed mutagenesis based Ag variants were computationally designed (Choi et al., 2017; Parker et al., 2013) to disrupt Ab binding, as evaluated by a sequence potential (Pons et al., 2011), while maintaining Ag stability, as evaluated by molecular mechanics modeling (Gainza et al., 2013; Pearlman et al., 1995). These two properties were balanced in a Pareto optimal fashion (He et al., 2015), with the goal of ensuring that variants still express and fold similarly to the wild type protein, thereby enabling confident interpretation of Ab binding results. Designed variants were clustered to identify a minimal set predicted to disrupt binding to all of the docking models, ensuring coverage of all computational hypotheses. The selected Ag designs were experimentally evaluated for retention or loss of Ab binding, where loss of Ab binding signal suggests overlap of the true Ab epitope with at least one of the designed mutations in that variant.
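The Pareto-optimal balancing of predicted binding disruption against Ag stability can be illustrated with a minimal sketch. It assumes each candidate variant already carries two scores; the names `disruption` and `destabilization`, their scales, and the toy values are hypothetical and are not taken from the published implementation.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    name: str
    disruption: float       # hypothetical score: higher = more predicted loss of Ab binding
    destabilization: float  # hypothetical score: lower = Ag stability better preserved


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.disruption >= b.disruption and a.destabilization <= b.destabilization
    better = a.disruption > b.disruption or a.destabilization < b.destabilization
    return no_worse and better


def pareto_front(cands):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Toy values for illustration only.
candidates = [
    Candidate("v1", 5.0, 1.0),
    Candidate("v2", 4.0, 0.5),
    Candidate("v3", 3.0, 2.0),  # dominated by both v1 and v2
    Candidate("v4", 6.0, 3.0),
]
front_names = [c.name for c in pareto_front(candidates)]
print(front_names)
```

Variants on the front represent different trade-offs between the two properties; none can be improved on one objective without giving something up on the other.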

Epitope localization of two monoclonal Abs

Our investigation of computationally-driven epitope mapping was prompted by studies of two previously uncharacterized antibodies against tumor Ag B7H6: TZ47, a murine antibody generated through mouse immunizations (Choi et al., 2015), and PB11, a human scFv generated through directed evolution of a human Ab fragment library (Feldhaus et al., 2003). EpiScope designed a set of 4 (for TZ47) and 5 (for PB11) triple-mutant B7H6 variants to probe all of the docking models (Figure 2A and D and Table 1; design details including docking models are provided in a PyMol session file in Supplementary file 2, and all sequence details are provided in Supplementary file 3). There were 28 docking models for each Ab, and each variant was predicted to disrupt between 5 and 12 of the models. The designed B7H6 variants were expressed in the context of the full-length transmembrane protein on the surface of human embryonic kidney (HEK) cells and evaluated for Ab binding via flow cytometry. All variants maintained binding to NKp30, the natural ligand for B7H6 (Figure 2B and E), suggesting proper expression and folding. Lower NKp30 binding signal for PB-Ag2 may result from the proximity of designed mutations to the NKp30 binding site, rather than changes in Ag expression or stability. Conversely, a lack of binding to negative control antibodies with alternative Ag specificities demonstrated that the introduced mutations did not facilitate indiscriminate Ab binding. Altogether, these findings suggest that binding changes for designed variants resulted from specific disruption of Ab binding interfaces, rather than altered Ag stability or structure.

Figure 2 with 1 supplement
Small sets of designed Ag variants enable epitope localization for two different B7H6-targeting Abs.

(A–C) TZ47; (D–F) PB11. (A and D) Designed Ag variants, color-coded by triple mutation sets (Table 1). NKp30, a natural ligand for B7H6, is shown in grey ribbon. (B and E) Flow cytometry results from staining variant-expressing HEK cells with the relevant Ab, using NKp30-Ig as a positive control. Fluorescence is normalized to WT Ag-expressing cells. The dotted lines represent average background fluorescence measured from negative control Abs. Experiments were conducted in triplicate and error bars show the standard deviation. (C and F) Docking models (Ab cartoons of different colors) affected by the disruptive Ag variants (highlighted in red for TZ47 and green for PB11). Bar graphs depict the average (height) and standard deviation (error bars) of the MFI of 3 technical replicates, defined as the equivalent staining of a single batch of transfected cells repeated in three separate wells in the same experiment. One outlier value was excluded (PB11-staining of PB11-Ag1) where fewer than 1500 live cells were sampled and the raw MFI was two orders of magnitude higher than the other two replicates (1145.6 vs. 14.41 and 14.20).

https://doi.org/10.7554/eLife.29023.003
Table 1
Summary of mutations in EpiScope Ag designs for each Ab.

Designs that disrupted binding for each Ab are highlighted.

https://doi.org/10.7554/eLife.29023.007
Design | Mutations
TZ47-Ag1 | F47Y, N49Q, W98E
TZ47-Ag2 | F184D, I188Q, V225T
TZ47-Ag3 | T71K, K74E, V76H
TZ47-Ag4 | M154E, N157G, S217H
PB-Ag1 | M30V, Q132V, Q136L
PB-Ag2 | F51H, Y52D, R99G
PB-Ag3 | A88T, F89T, G111R
PB-Ag4 | T176K, V194I, R231E
PB-Ag5 | N216K, S217A, Q219V

Two Ab-specific designs, TZ47-Ag4 and PB11-Ag2, reduced Ab binding to levels comparable to the negative control while maintaining binding to the natural ligand (Figure 2B and E), suggesting that at least one of the three designed mutations in each variant is part of the epitope. While this constitutes successful epitope localization, the results can be further interpreted in terms of the docking models with binding interfaces disrupted by these mutations. There were five such models for each Ab (out of the original 28 each), substantially limiting the possible epitope regions to 17.4% and 26.7% of the surface respectively (out of the original 82% and 88% covered by the sets of docking models) (Figure 2C and F). Thus, a small set of designs localized the epitope on the Ag in terms of disruptive mutations and the footprints of the docking models consistent with those effects.
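The filtering step described above (keeping only docking models whose Ag footprint contains a disrupted position) can be sketched as follows. The mutated positions are those of TZ47-Ag4 from Table 1; the model names, interface residue sets, and surface size are hypothetical toy values, not the actual docking output.

```python
def consistent_models(model_interfaces, disruptive_positions):
    """Return docking models whose Ag interface contains at least one mutated position."""
    hit = set(disruptive_positions)
    return sorted(m for m, iface in model_interfaces.items() if hit & set(iface))


def footprint_fraction(model_interfaces, kept, n_surface):
    """Fraction of Ag surface residues covered by the union of the kept footprints."""
    covered = set()
    for m in kept:
        covered |= set(model_interfaces[m])
    return len(covered) / n_surface


# Hypothetical interface residue sets (Ag positions) for four docking models.
models = {
    "model_01": {150, 154, 155, 157},
    "model_02": {47, 49, 98},
    "model_03": {215, 216, 217, 218},
    "model_04": {71, 74, 76},
}
tz47_ag4_positions = [154, 157, 217]  # M154E, N157G, S217H (Table 1)

kept = consistent_models(models, tz47_ag4_positions)
fraction = footprint_fraction(models, kept, n_surface=40)  # 40 is a toy surface size
print(kept, fraction)
```

More than one model can survive the filter, so the result is a set of consistent footprints, not a single binding mode.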

In some cases, it may be desirable to pursue follow-up experiments to obtain finer resolution of the epitope guided by the initial coarse-grained localization. Here, a chimera-based approach was used to further probe the TZ47 epitope, based on the identified disruptive design TZ47-Ag4 along with prior experimental results demonstrating that TZ47 cannot recognize macaque B7H6 Ag (despite ~75% identity to human B7H6). The chimera SD9 (Figure 2—figure supplement 1) contains the macaque B7H6 sequence in the region of the designed mutations in TZ47-Ag4, differing from the human sequence by four amino acids including TZ47-Ag4’s mutation at M154, where the macaque sequence contained a similarly hydrophobic valine (V) and TZ47-Ag4 contained a more dramatic change to a negatively charged glutamate (E). Chimera SD9 similarly disrupted TZ47 binding, reconfirming the importance of the common mutation site and general epitope region.

Ag variant design for simultaneous localization of both Abs

Despite the large distance separating the localized epitopes of TZ47 and PB11 (Figure 3—figure supplement 1), significant overlap was observed between the initial docking models for the two Abs (Figure 3—figure supplement 2). This overlap led to similarities in the Ag variant designs, with TZ47-targeted designs covering 23/28 PB11 docks and PB11-targeted designs covering 25/28 TZ47 docks. These results suggested that the Ab-specificity of docking models is limited and that greater experimental efficiency could be achieved by optimizing designs to disrupt predicted epitopes common to the Abs.

To determine if experiments could be designed to take advantage of similarities in possible binding modes while also accounting for differences, we generated an integrated set of 6 designs (Table 2) interrogating all 56 docking models generated for the two antibodies combined (Figure 3A; full details are provided in a PyMol session in Supplementary file 2). This simultaneous design scheme represents a substantial reduction from the initial total of 9 designs (4 for TZ47 and 5 for PB11) to separately localize each epitope, but still resulted in successful localization of both Abs, with two different variants successfully disrupting binding to the two Abs (Figure 3B). These disruptive variants overlap five docking models each (Figure 3C), the majority of which were also affected by disruptive designs based on individual Abs, demonstrating agreement in the localization of TZ47 and PB11 epitopes from both single Ab-input and multiple Ab-input designs to a few docking models. We conclude that the multi-Ab approach can both decrease the required experimental effort and increase the likelihood of successful epitope localization, offering the novel capacity to multiplex epitope localization efforts through the rational design of Ag variant panels to simultaneously probe multiple Ab inputs.
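The idea of choosing one small panel that covers the pooled docking models of several Abs can be illustrated with a greedy set-cover heuristic. This is a swapped-in technique for illustration: the study selects variants by clustering, and the variant names and the mapping from variants to disrupted models below are hypothetical.

```python
def greedy_cover(variant_to_models, all_models):
    """Greedily pick variants until every docking model is disrupted by some variant.

    Returns the chosen variants and any models that no candidate can disrupt.
    """
    uncovered = set(all_models)
    chosen = []
    while uncovered:
        # Pick the variant disrupting the most still-uncovered models
        # (ties are broken alphabetically because the keys are sorted).
        best = max(sorted(variant_to_models),
                   key=lambda v: len(variant_to_models[v] & uncovered))
        gained = variant_to_models[best] & uncovered
        if not gained:
            break  # the remaining models cannot be covered by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical candidates; each maps to the pooled docking models it would disrupt.
variants = {
    "A": {"TZ47-1", "TZ47-2", "PB11-1"},
    "B": {"TZ47-3", "PB11-2", "PB11-3"},
    "C": {"TZ47-1", "PB11-1"},
    "D": {"TZ47-4"},
}
pooled_models = set().union(*variants.values())

chosen, uncovered = greedy_cover(variants, pooled_models)
print(chosen, uncovered)
```

Because models from both Abs sit in one pool, a variant that disrupts overlapping docks for both Abs counts once, which is where the saving over two separate panels comes from.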

Figure 3 with 2 supplements
A single set of Ag variants enables simultaneous localization of two different B7H6-targeting Abs.

(A) Designed Ag variants color-coded by triple-mutant design, with natural binding partner NKp30 in grey ribbon. (B) Flow cytometry results from staining variant-expressing HEK cells with the relevant Ab, using NKp30 as a positive control. Fluorescence was normalized to WT antigen-expressing cells. The dotted line represents average background fluorescence measured from negative control Abs. (C) Docking models (Ab cartoons of different colors) affected by disruptive Ag variants (highlighted in orange for TZ47 and magenta for PB11), for left: TZ47 and right: PB11. Bar graphs depict the average (height) and standard deviation (error bars) of the MFI of 3 technical replicates, defined as the equivalent staining of a single batch of transfected cells repeated in three separate wells in the same experiment. One replicate value was excluded where fewer than 1500 live cells were sampled from the well (one replicate of PB11-staining of MULTI-1) and the raw MFI was two orders of magnitude larger than the other two replicates (232.8 vs. 2.55 and 3.71).

https://doi.org/10.7554/eLife.29023.008
Table 2
Summary of mutations in Multi-Ab specific EpiScope Ag designs.

Designs that disrupted binding for each Ab are highlighted.

https://doi.org/10.7554/eLife.29023.012
Design | Mutations
MULTI-1 | N57D, D84N, W98E (PB11)
MULTI-2 | F66Y, T71K, F72D
MULTI-3 | V78L, F89T, G111R
MULTI-4 | M154E, N157E, N216K (TZ47)
MULTI-5 | A172H, R231E, A233E
MULTI-6 | T176K, R231E, H236S

Generalizability of localizing Ab epitopes from sequence-encoded information

To assess the generalizability of harnessing Ab sequence-encoded binding information to design efficient epitope localization experiments, we designed Ag variant sets for 33 distinct Ab-Ag complexes with high quality crystal structures (Table 3). In these tests, an Ab epitope was considered successfully localized if at least one of the generated designs contained a mutation within the Ab-Ag binding interface. By this metric, the epitope was successfully localized in 88% (29/33) of the test cases (Figure 4A). Strikingly, this success rate could be achieved using an average of only 3 Ag variants for each test case (Figure 4B). There was a weak correlation between Ag size and the number of experiments needed to probe all generated docking models (r = 0.51), but no correlation between Ag size and the rate of successful epitope localization (Figure 4—figure supplement 1).
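The success criterion and the resulting rate can be written out explicitly. In this minimal sketch, `localized` encodes the stated criterion on hypothetical residue sets, and the four missed cases are the entries marked N for crystal structures in Table 3; the function names are illustrative.

```python
def localized(designs, interface):
    """A case counts as a success if any design mutates a residue in the true interface."""
    iface = set(interface)
    return any(iface & set(design) for design in designs)


def success_rate(n_cases, missed):
    """Fraction of test cases in which the epitope was localized."""
    return (n_cases - len(missed)) / n_cases


# Criterion on toy data: the second design touches interface residue 98.
toy_designs = [(30, 132, 136), (51, 52, 98)]
toy_interface = {95, 96, 98, 101}
toy_hit = localized(toy_designs, toy_interface)

# Crystal-structure cases marked N in Table 3.
crystal_missed = {"1H0D", "1OAZ", "3HI1", "4RGO"}
rate = success_rate(33, crystal_missed)
print(toy_hit, f"{rate:.1%}")
```

Note that this criterion is retrospective: it checks overlap with the crystallographic interface and does not model whether the mutation would actually ablate binding.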

Figure 4 with 4 supplements
Retrospective validation demonstrates generality of efficiency and effectiveness in localizing epitopes.

(A) Over a test set of 33 diverse Ab-Ag pairs with co-crystal structures, the number of pairs in which at least one binding interface residue is included among the disruptive mutations in a set of 1–6 Ag triple-mutant variants. Ultimately, two pairs were missed when using Ag crystal structure and three pairs when using Ag homology models. (B) Violin plots of the number of Ag variants required to incorporate mutations predicted to disrupt all docking models.

https://doi.org/10.7554/eLife.29023.013
Table 3
Retrospective test cases.

Columns indicate the PDB ID of each Ab-Ag pair; the number of residues for various subsets of the Ag; the number and success of EpiScope designs based on crystal and model Ag structures; a measure of the quality of the closest native-like docking model among ClusPro generated models (fnat [Lensink et al., 2007]); the quality of the homology models built for Ab and Ags (TM-score [Zhang and Skolnick, 2004]); and the number of docking decoys generated by ClusPro.

https://doi.org/10.7554/eLife.29023.021
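PDB code | Residues: whole | Residues: surface | Residues: epitope | Crystal: number of designs | Crystal: overlap with epitope | Model: number of designs | Model: overlap with epitope | Fnat: crystal | Fnat: model | TM-score: antibody | TM-score: antigen | Docking decoys: crystal | Docking decoys: model
1FE8 | 196 | 124 | 27 | 4 | Y | 3 | N | 0.1 | 0.04 | 0.96 | 0.84 | 30 | 24
1FNS | 196 | 120 | 12 | 5 | Y | 2 | Y | 0.39 | 0.09 | 0.96 | 0.86 | 26 | 20
1H0D | 123 | 96 | 14 | 3 | N | 2 | Y | 0 | 0.05 | 0.98 | 0.79 | 30 | 30
1LK3 | 160 | 102 | 26 | 3 | Y | 3 | Y | 0.73 | 0.44 | 0.97 | 0.74 | 23 | 29
1OAZ | 123 | 101 | 14 | 2 | N | 2 | N | 0.1 | 0.1 | 0.97 | 0.77 | 30 | 29
1OB1 | 99 | 74 | 13 | 3 | Y | 4 | Y | 0.62 | 0.61 | 0.97 | 0.85 | 30 | 29
1RJL | 95 | 82 | 13 | 3 | Y | 2 | Y | 0.29 | 0.3 | 0.96 | 0.89 | 30 | 27
1V7M | 163 | 113 | 20 | 6 | Y | 3 | Y | 0.45 | 0.19 | 0.96 | 0.78 | 27 | 24
1YJD | 140 | 86 | 14 | 2 | Y | 3 | Y | 0.42 | 0.13 | 0.98 | 0.77 | 25 | 30
2ARJ | 123 | 90 | 17 | 3 | Y | 3 | Y | 0.63 | 0.26 | 0.97 | 0.75 | 24 | 30
2VXQ | 96 | 71 | 21 | 3 | Y | 3 | Y | 0.32 | 0.21 | 0.89 | 0.89 | 30 | 30
2VXT | 157 | 116 | 19 | 3 | Y | 3 | Y | 0.83 | 0.13 | 0.95 | 0.93 | 13 | 19
2XQB | 114 | 87 | 18 | 2 | Y | 3 | Y | 0.16 | 0.25 | 0.92 | 0.89 | 17 | 21
3D9A | 129 | 93 | 19 | 3 | Y | 2 | Y | 0.09 | 0.63 | 0.93 | 0.93 | 22 | 29
3HI1 | 290 | 246 | 20 | 4 | N | 3 | N | 0.28 | 0.02 | 0.97 | 0.83 | 30 | 29
3L5X | 113 | 83 | 8 | 6 | Y | 3 | Y | 0.18 | 0.24 | 0.98 | 0.83 | 30 | 30
3MXW | 169 | 108 | 22 | 2 | Y | 2 | Y | 0.52 | 0.47 | 0.96 | 0.92 | 20 | 23
3QWO | 57 | 48 | 10 | 3 | Y | 4 | Y | 0.32 | 0.53 | 0.96 | 0.86 | 30 | 30
3RKD | 146 | 105 | 18 | 5 | Y | 3 | Y | 0.55 | 0.48 | 0.97 | 0.49 | 30 | 30
4DN4 | 76 | 50 | 12 | 2 | Y | 2 | Y | 0.49 | 0.28 | 0.92 | 0.88 | 20 | 27
4DW2 | 222 | 175 | 20 | 4 | Y | 3 | Y | 0.1 | 0.35 | 0.92 | 0.85 | 30 | 30
4ETQ | 226 | 186 | 22 | 4 | Y | 3 | Y | 0.45 | 0.57 | 0.96 | 0.94 | 30 | 30
4G3Y | 157 | 114 | 12 | 3 | Y | 4 | Y | 0.35 | 0.12 | 0.94 | 0.86 | 30 | 30
4G6J | 158 | 109 | 13 | 3 | Y | 3 | Y | 0.57 | 0.15 | 0.97 | 0.85 | 30 | 30
4I3S | 190 | 163 | 23 | 4 | Y | 4 | Y | 0.05 | 0.09 | 0.81 | 0.82 | 30 | 30
4JZJ | 252 | 210 | 18 | 4 | Y | 6 | Y | 0.31 | 0.25 | 0.95 | 0.33 | 30 | 30
4KI5 | 183 | 108 | 7 | 1 | Y | 1 | Y | 0.39 | 0.29 | 0.95 | 0.93 | 24 | 20
4L5F | 111 | 79 | 9 | 2 | Y | 3 | Y | 0.52 | 0.1 | 0.97 | 0.8 | 30 | 30
4LVH | 223 | 184 | 13 | 5 | Y | 6 | N | 0.12 | 0.05 | 0.93 | 0.9 | 30 | 29
4M62 | 155 | 105 | 6 | 2 | Y | 2 | N | 0.12 | 0.11 | 0.92 | 0.81 | 24 | 24
4NP4 | 272 | 230 | 25 | 3 | Y | 5 | Y | 0.09 | 0.07 | 0.96 | 0.67 | 30 | 30
4RGO | 226 | 187 | 17 | 3 | N | 6 | Y | 0.16 | 0.21 | 0.97 | 0.96 | 30 | 30
5D96 | 235 | 198 | 22 | 4 | Y | 4 | Y | 0.23 | 0.15 | 0.95 | 0.96 | 30 | 30
Average | 162.88 | 122.52 | 16.48 | 3.30 | | 3.18 | | 0.33 | 0.24 | 0.95 | 0.82 | 27.12 | 27.67
STD | 57.57 | 51.53 | 5.54 | 1.19 | | 1.21 | | 0.21 | 0.18 | 0.03 | 0.13 | 4.53 | 3.55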

We proceeded to use this dataset to investigate the impacts of the various inputs and parameters on the effectiveness of epitope localization.

Ag homology models

We investigated whether the information contained in Ag sequence would suffice to drive epitope localization via homology modeling, or whether the Ag crystal structure was required. For extra stringency, Ag homology models were based on moderately similar template structures (20–50% sequence identity), which yielded models with relatively high structural similarity (average TM-score [Zhang and Skolnick, 2004] of 0.82). Ag homology models were nearly as effective as crystal structures, obtaining an 85% success rate (vs. 88% for crystal structures) in epitope localization (Figure 4A and Figure 4—figure supplement 1), and still requiring an average of only three variants (Figure 4B). Thus a homology model can provide a suitable surrogate for Ag structure when no crystal structure is available. Surprisingly, use of the homology model enabled localization of two Ab epitopes missed when using the crystal structure, but failed to localize three Ab epitopes captured by use of the crystal structure. In these cases, the most native-like docking model generated for the failed Ag structure was less similar to the true binding mode (as measured by a lower fnat score (Lensink et al., 2007)) than that generated using the alternative Ag structure (Tables 3, 4 and 5).
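For readers unfamiliar with fnat, it is the fraction of native inter-protein residue contacts that a docking model reproduces. A minimal sketch follows, assuming contacts have already been extracted as (Ab residue, Ag residue) pairs; the contact definition and the toy residue labels are illustrative assumptions.

```python
def fnat(native_contacts, model_contacts):
    """Fraction of native Ab-Ag residue contacts that are also present in the model."""
    native = set(native_contacts)
    if not native:
        raise ValueError("native complex has no contacts")
    return len(native & set(model_contacts)) / len(native)


# Toy contacts as (antibody residue, antigen residue) pairs.
native = {("H:Y33", "A:154"), ("H:W52", "A:157"), ("L:S91", "A:217"), ("H:D100", "A:216")}
model = {("H:Y33", "A:154"), ("H:W52", "A:150"), ("L:N30", "A:98")}

score = fnat(native, model)
print(score)
```

A model can therefore score low on fnat while still overlapping the true epitope, since fnat counts residue pairs and not Ag residues alone.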

Table 4
Ab modeling quality.

Antibody structures were generally highly accurately predicted both overall (average TM-score: 0.95) and for CDRs (all-backbone-atom, including N, C, Cα and O, RMSDs reported). Overall, non-CDR-H3 loops were very well predicted based on the canonical rules, and even for CDR-H3 loops the average RMSD was <2 Å.

https://doi.org/10.7554/eLife.29023.022
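Target | Species | CDR-L1 RMSD | CDR-L2 RMSD | CDR-L3 RMSD | CDR-H1 RMSD | CDR-H2 RMSD | CDR-H3 RMSD | CDR-H3 sequence | CDR-H3 length | TM-score
1FE8 | MOUSE | 0.42 | 0.22 | 0.74 | 1.01 | 0.51 | 0.63 | AGNYYGMDY | 9 | 0.96
1FNS | MOUSE | 0.54 | 0.18 | 0.93 | 0.27 | 0.60 | 2.10 | VRDPADYGNYDYALDY | 16 | 0.96
1H0D | MOUSE | 1.43 | 0.57 | 0.42 | 0.44 | 1.11 | 0.66 | TRLGDYGYAYTMDY | 14 | 0.98
1LK3 | RAT | 0.41 | 0.43 | 0.52 | 0.57 | 1.51 | 1.00 | TRGVPGNNWFPY | 12 | 0.97
1OAZ | MOUSE | 1.15 | 0.44 | 0.88 | 1.30 | 0.56 | 1.25 | ARMWYYGTYYFDY | 13 | 0.97
1OB1 | MOUSE | 0.58 | 0.31 | 0.63 | 0.42 | 0.63 | 1.97 | ARNYYRFDGGMDF | 13 | 0.97
1RJL | MOUSE | 1.43 | 0.57 | 4.96 | 0.69 | 1.00 | 1.16 | ARMRYGDYYAMDN | 13 | 0.96
1V7M | MOUSE | 0.70 | 0.26 | 0.83 | 0.65 | 1.10 | 0.59 | SGWSFLY | 7 | 0.96
1YJD | MOUSE | 0.88 | 0.51 | 1.34 | 0.62 | 1.19 | 1.76 | TRSHYGLDWNFDV | 13 | 0.98
2ARJ | RAT | 0.71 | 0.67 | 1.12 | 0.46 | 0.70 | 0.65 | TPLIGSWYFDF | 11 | 0.97
2VXQ | HUMAN | 0.35 | 0.74 | 0.96 | 0.90 | 1.27 | 1.05 | ARLDGYTLDI | 10 | 0.89
2VXT | MOUSE | 0.47 | 0.37 | 1.14 | 0.45 | 0.53 | 0.43 | ARGLRF | 6 | 0.95
2XQB | HUMAN | 1.61 | 0.43 | 0.98 | 1.19 | 0.89 | 7.21 | ARDPAAWPLQQSLAWFDP | 18 | 0.92
3D9A | MOUSE | 0.40 | 0.61 | 1.18 | 0.99 | 1.88 | 0.51 | ANWDGDY | 7 | 0.93
3HI1 | HUMAN | 0.80 | 0.86 | 0.83 | 0.61 | 0.44 | 1.25 | ARGPVPAVFYGDYRLDP | 17 | 0.97
3L5X | HUMAN | 0.56 | 0.61 | 0.91 | 1.05 | 0.90 | 1.73 | ARMGSDYDVWFDY | 13 | 0.98
3MXW | HUMAN | 0.58 | 0.71 | 0.71 | 1.09 | 0.82 | 0.96 | ARDWERGDFFDY | 12 | 0.96
3QWO | HUMANIZED | 0.48 | 0.28 | 1.09 | 0.87 | 0.50 | 1.13 | ARDMIFNFYFDV | 12 | 0.96
3RKD | MOUSE | 0.62 | 0.42 | 0.52 | 1.06 | 0.65 | 1.45 | ARIKSVITTGDYALDY | 16 | 0.97
4DN4 | HUMAN | 2.11 | 0.37 | 1.58 | 1.64 | 2.40 | 2.36 | ARYDGIYGELDF | 12 | 0.92
4DW2 | MOUSE | 1.20 | 0.43 | 4.12 | 0.85 | 1.14 | 3.18 | ERGELTYAMDY | 11 | 0.92
4ETQ | MOUSE | 1.07 | 0.29 | 1.59 | 0.35 | 0.91 | 0.94 | TRSNYRYDYFDV | 12 | 0.96
4G3Y | CHIMERIC | 0.68 | 0.71 | 0.57 | 0.90 | 0.98 | 1.22 | SRNYYGSTYDY | 11 | 0.94
4G6J | HUMAN | 0.72 | 0.44 | 0.90 | 0.41 | 0.35 | 1.14 | ARDLRTGPFDY | 11 | 0.97
4I3S | HUMAN | 1.34 | 0.46 | 0.64 | 4.33 | 1.08 | 3.49 | ARQKFYTGGQGWYFDL | 16 | 0.81
4JZJ | HUMAN | 0.59 | 0.54 | 0.98 | 0.84 | 1.04 | 2.96 | ARSHLLRASWFAY | 13 | 0.95
4KI5 | MOUSE | 0.74 | 0.52 | 0.78 | 2.14 | 0.44 | 1.49 | AREDDGLAS | 9 | 0.95
4L5F | MOUSE | 0.76 | 0.42 | 1.03 | 0.49 | 0.94 | 1.83 | TKRINWALDY | 10 | 0.97
4LVH | MOUSE | 1.63 | 0.71 | 2.82 | 1.46 | 2.70 | 1.91 | ARHGSPGYTLYAWDY | 15 | 0.93
4M62 | HUMAN | 2.08 | 0.79 | 1.40 | 2.55 | 2.78 | 8.26 | AREGTTGSGWLGKPIGAFAY | 20 | 0.92
4NP4 | HUMAN | 2.21 | 0.87 | 2.84 | 0.88 | 0.55 | 1.53 | ARRRNWGNAFDI | 12 | 0.96
4RGO | MOUSE | 0.53 | 1.01 | 0.75 | 0.71 | 0.31 | 2.20 | VRDLYGDYVGRYAY | 14 | 0.97
5D96 | MOUSE | 0.74 | 0.57 | 0.53 | 0.62 | 0.89 | 3.43 | ASDSMDPGSFAY | 12 | 0.95
Average | | 0.92 | 0.52 | 1.25 | 0.99 | 1.01 | 1.92 | | | 0.95
STD | | 0.53 | 0.20 | 1.01 | 0.78 | 0.62 | 1.71 | | | 0.03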
Table 5
The quality of Ag models and their template structures.

Failed cases are highlighted in red.

https://doi.org/10.7554/eLife.29023.024
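Target | Template | Template chain | Seq. ID. | TM-score
1FE8 | 3PPY | A | 28.09 | 0.84
1FNS | 4IGI | A | 24.73 | 0.86
1H0D | 3MWQ | A | 33.88 | 0.79
1LK3 | 4DOH | A | 27.94 | 0.74
1OAZ | 2PUK | C | 48.04 | 0.77
1OB1 | 1N1I | A | 49.44 | 0.85
1RJL | 2FKJ | C | 62.11 | 0.89
1V7M | 1CN4 | C | 23.74 | 0.78
1YJD | 1AH1 | A | 30.70 | 0.77
2ARJ | 4XMN | F | 26.26 | 0.75
2VXQ | 1N10 | A | 41.30 | 0.89
2VXT | 4XFS | A | 94.23 | 0.93
2XQB | 2PSM | A | 69.91 | 0.89
3D9A | 2EQL | A | 49.22 | 0.93
3HI1 | 2BF1 | A | 33.94 | 0.83
3L5X | 3BPO | A | 99.05 | 0.83
3MXW | 2IBG | B | 70.00 | 0.92
3QWO | 1EDK | A | 50.94 | 0.86
3RKD | 3RKC | A | 88.19 | 0.49
4DN4 | 3FPU | B | 41.67 | 0.88
4DW2 | 2ODQ | A | 25.94 | 0.85
4ETQ | 2ZNC | A | 30.56 | 0.94
4G3Y | 1TNR | A | 36.43 | 0.86
4G6J | 3NJ5 | A | 35.37 | 0.85
4I3S | 2B4C | A | 61.96 | 0.82
4JZJ | 4RS1 | A | 31.97 | 0.33
4KI5 | 4QDR | A | 44.97 | 0.93
4L5F | 2HG0 | A | 45.92 | 0.80
4LVH | 5BNY | A | 40.89 | 0.90
4M62 | 4GQX | A | 23.94 | 0.81
4NP4 | 2GJ6 | A | 35.86 | 0.67
4RGO | 5FKA | C | 34.23 | 0.96
5D96 | 3G6O | A | 80.77 | 0.96
Average | | | 46.13 | 0.82
STD | | | 21.06 | 0.13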

Random designs

In order to evaluate how much the process benefits from docking models, we performed the same design approach but using random sets of surface positions (still constraining average Cα distance <12 Å) instead of using designs optimized to disrupt specific docking models. Here surface was defined as a relative solvent accessibility greater than 7% (Mizuguchi et al., 1998). For each target, 1000 random triple mutants were generated and subsets were selected by the same clustering approach so as to match the number of plans used by EpiScope for that target. To account for effects of random variation, the process was repeated 1000 times for each target. On average, the success rates of plans using random designs were approximately 60%, compared to 85–88% for plans guided by docking (Figure 4—figure supplement 2). Random plans sufficed for some very small proteins (e.g., 3QWO, with 48 surface residues), yielding success rates reaching nearly 90%. On the other hand, for the moderately-sized 4KI5 (108 surface residues), EpiScope specified that it needed just a single design, which was indeed successful, though the random design approach success rate for this target was only 10%. In general, the substantially higher and consistent performance of docking-guided design, along with the guidance it provides regarding the number of variants that must be tested in order to cover all the hypotheses, demonstrates that docking does indeed provide valuable information about where and how to target mutations.
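The random baseline can be sketched as rejection sampling of surface triples whose average pairwise Cα distance stays below 12 Å. The coordinates here are a toy line of residues spaced 4 Å apart, and the function names, the retry limit, and the seed are illustrative assumptions; the clustering and repetition steps of the actual procedure are omitted.

```python
import itertools
import math
import random


def mean_pairwise_distance(coords):
    """Average Euclidean distance over all pairs of the given 3D points."""
    pairs = list(itertools.combinations(coords, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def random_triple(surface_ca, rng, max_mean=12.0, max_tries=10000):
    """Draw random surface triples until the average C-alpha distance is below max_mean."""
    ids = sorted(surface_ca)
    for _ in range(max_tries):
        triple = rng.sample(ids, 3)
        if mean_pairwise_distance([surface_ca[i] for i in triple]) < max_mean:
            return tuple(sorted(triple))
    return None  # no acceptable triple found within the retry budget


# Toy surface: 20 residues on a line, 4 A apart (not a real structure).
surface_ca = {i: (4.0 * i, 0.0, 0.0) for i in range(1, 21)}
rng = random.Random(0)  # fixed seed only for reproducibility of the sketch

triple = random_triple(surface_ca, rng)
triple_mean = mean_pairwise_distance([surface_ca[i] for i in triple])
print(triple, triple_mean)
```

Repeating the draw many times per target and scoring each resulting plan against the known interface would give a baseline success rate comparable in spirit to the one reported above.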

Mutations per design

To assess the trade-off between efficiency and precision of epitope localization, the number of mutations permitted per design was varied from 1 to 4 (Figure 4—figure supplement 3). With more mutations per design, fewer variants were required to sufficiently interrogate all docking models (Figure 4—figure supplement 3A), and an average of one design overlapped the true Ab epitope regardless of the number of mutations permitted (Figure 4—figure supplement 3B–C). However, the precision with which epitopes could be localized also decreased with increasing numbers of mutations per design (Figure 4—figure supplement 3D–E; Kendall’s τ: 0.37 and 0.30 for the number of docking models and residues respectively). Most significantly, the success rate of epitope localization was highest when using three mutations/design (Figure 4—figure supplement 2F), suggesting that the choice of triple mutants for the prospective application provided a good balance between low relative experimental effort, acceptable epitope resolution, and high success rate.

Inter-mutation distance

We next explored the relationship between allowed inter-mutation distance vs. resolution and success rate, since closer mutations could lead to more precise identification of the epitope but at the expense of actually hitting epitopes less frequently. We found that average Cα distances in our initial designs were generally 11 to 15 Å (Figure 4—figure supplement 4A). Double and triple mutation variants were then designed while systematically varying the distance cut-off in order to obtain different average inter-mutation distances. Single mutation variants were also designed in order to provide a baseline for comparison. For ease of interpretation in terms of the effects of the distance threshold, the algorithm was constrained to generate a single design of 1, 2, or 3 mutations with which to disrupt docking models.

The baseline success rate in hitting an epitope with a single optimized mutation was approximately 25% (Figure 4—figure supplement 4B); for reference, random single mutations hit epitopes about 14% of the time (not shown in the figure). Increasing the mutational load had a substantial impact, with the success rate jumping to 50% and 55% for just one double or triple mutant (Figure 4—figure supplement 4B–D). Spreading mutations out more than the initially selected 12 Å average did not improve the success rate, perhaps due to lack of coherence. Bringing them too close together likewise decreased the success rate, though we note that this observation is limited by the fact that there were not many designs available at shorter cut-offs (particularly 6 Å). Filtering docking models according to consistency with disruptive mutations left an average of about 12 models covering about 47% of the Ag surface in the baseline case of a single optimized mutation (an average of 5.6 docking models spanning 32% of the Ag surface for random single mutations, not shown in the figure). Double and triple mutants resulted in a few more models covering slightly more surface residues (16 and 17 models, and 52% and 56% of the surface at 12 Å), not sacrificing much in resolution in order to obtain their better success rates. Thus, while the cut-off did result in a trade-off between the resolution and the success rate, a 10–12 Å threshold seemed to provide the best balance.

Epitope definition

Finally, we considered the impact of the definition of ‘ground truth’ on the assessment of these retrospective tests. While the epitope definitions used so far were those deposited in the IEDB as experimentally verified, the larger set of ‘binding interface’ residues could also be considered. We analyzed our results in terms of such residues, as determined by an inter-heavy atom distance of 5 Å in the co-crystal structure. Table 6 details that, as would be expected, this broader definition yields improved hit rates, with only three targets missed using either crystal structures or homology models (compared to 4 and 5, respectively, for IEDB epitopes). As with the IEDB specification of epitopes, these additional results suggest that both crystal structures and homology models of the Ags may be sufficient to localize Ab:Ag binding. While mutations at binding interface positions may not completely disrupt Ab-Ag binding, they may be good enough to enable detection of binding reduction, particularly since EpiScope selects highly-disruptive mutations.
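The ‘binding interface’ definition can be illustrated with a short sketch. It assumes atoms are supplied as (residue id, element, coordinates) tuples and treats the 5 Å cut-off as inclusive; both are illustrative choices, not details given in the text.

```python
import math


def interface_residues(ag_atoms, ab_atoms, cutoff=5.0):
    """Return Ag residue ids having any heavy atom within `cutoff` Å of any
    Ab heavy atom.

    ag_atoms, ab_atoms: lists of (residue_id, element, (x, y, z)) tuples
    (hypothetical input format). Hydrogens are ignored.
    """
    ab_heavy = [xyz for _, elem, xyz in ab_atoms if elem != "H"]
    hits = set()
    for res_id, elem, xyz in ag_atoms:
        if elem == "H" or res_id in hits:
            continue
        if any(math.dist(xyz, other) <= cutoff for other in ab_heavy):
            hits.add(res_id)
    return hits


# Toy example: only residue 10 has a heavy atom within 5 Å of the Ab atom.
toy_ag = [(10, "C", (0.0, 0.0, 0.0)), (10, "H", (1.0, 0.0, 0.0)),
          (11, "N", (10.0, 0.0, 0.0)), (12, "H", (3.0, 0.0, 0.0))]
toy_ab = [(1, "O", (4.0, 0.0, 0.0))]
toy_interface = interface_residues(toy_ag, toy_ab)
```

A design would then count as a hit under this broader definition if any of its mutated positions falls in the returned set.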

Table 6
Success rates with epitopes defined according to IEDB or according to contacts in the binding interface.

Success is indicated as ‘T’ and failure as ‘F’. In test cases colored blue, EpiScope failed to find IEDB epitopes but did find binding interface residues.

https://doi.org/10.7554/eLife.29023.025
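| Target | Crystal structure: IEDB | Crystal structure: Interface | Model structure: IEDB | Model structure: Interface |
|---|---|---|---|---|
| 1FE8 | T | T | F | F |
| 1FNS | T | T | T | T |
| 1H0D | F | F | T | T |
| 1LK3 | T | T | T | T |
| 1OAZ | F | F | F | F |
| 1OB1 | T | T | T | T |
| 1RJL | T | T | T | T |
| 1V7M | T | T | T | T |
| 1YJD | T | T | T | T |
| 2ARJ | T | T | T | T |
| 2VXQ | T | T | T | T |
| 2VXT | T | T | T | T |
| 2XQB | T | T | T | T |
| 3D9A | T | T | T | T |
| 3HI1 | F | F | F | F |
| 3L5X | T | T | T | T |
| 3MXW | T | T | T | T |
| 3QWO | T | T | T | T |
| 3RKD | T | T | T | T |
| 4DN4 | T | T | T | T |
| 4DW2 | T | T | T | T |
| 4ETQ | T | T | T | T |
| 4G3Y | T | T | T | T |
| 4G6J | T | T | T | T |
| 4I3S | T | T | T | T |
| 4JZJ | T | T | T | T |
| 4KI5 | T | T | T | T |
| 4L5F | T | T | T | T |
| 4LVH | T | T | F | T |
| 4M62 | T | T | F | T |
| 4NP4 | T | T | T | T |
| 4RGO | F | T | T | T |
| 5D96 | T | T | T | T |
| Total | 29 (88%) | 30 (91%) | 28 (85%) | 30 (91%) |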

Computationally-driven epitope binning and localization of multiple Abs targeting the same Ag

As observed with B7H6, computational modeling and design can uncover similarities and differences in possible binding modes of multiple different Abs against the same Ag, enabling design of a panel of variants to simultaneously map all of the epitopes. We sought to evaluate how this performance would scale for a larger set of Abs. We considered 12 immunization-induced Abs previously found to target four epitopes (Figure 5A) on the D8 envelope protein of vaccinia virus (Sela-Culang et al., 2014; Matho et al., 2014), the active component in smallpox vaccines. This retrospective test set thus serves as proof of concept for extracting relevant binding information from Ab sequences in order to characterize humoral responses to immunization, while also mirroring Ab discovery/isolation efforts where multiple Abs against a single Ag are isolated at once.

Figure 5
A small set of Ag variants has the potential to simultaneously localize multiple Ab epitopes for a single Ag.

(A) Heat map of competitive binding data (Sela-Culang et al., 2014) for 12 antibodies directed against the vaccinia virus D8 protein, with the extent of cross-blocking ranging from 0.0 (white, no effect) to 1.0 (black, complete blocking). Colors in all panels refer to the four Ab groups identified by this competition assay (I: purple, II: blue, III: yellow, and IV: red). (B) Heat map of the overlap between ClusPro-generated docks for each pair of Abs, ranging from 60% (white) to 100% (black). (C) Heat map of the average Hausdorff distance between Ag variants designed for each Ab, ranging from 0 (identical mutation sites, black) to 12 (white). (D–F) Ag variants designed to disrupt one Ab from each group (I: JE11, II: CC7.1, III: EE11, IV: LA5) are represented as triangles. Four designs were sufficient to cover all docking models, and the designs overlapped all of the epitope groups. True epitopes are color coded by group on the surface of the antigen; epitopes in group II and III overlapped, and are colored in green. Design residues overlapping the true epitopes are indicated with circles. (E and F) Zoomed views of epitope faces.

https://doi.org/10.7554/eLife.29023.026

The docking models generated for the 12 different anti-D8 Abs were fairly indistinguishable (Figure 5B); in fact, the models for each Ab covered on average ~80% of the surface of the Ag (Figure 5—figure supplement 1), leaving little room for differentiation. However, quite strikingly, similarities between EpiScope-generated variants designed to disrupt the docking models revealed patterns among the Abs (Figure 5C) that were similar to those observed in experimentally determined competitive binding assays (Figure 5A). Thus the combination of docking and design elucidated amino acid level patterns of specificity driving Ab–Ag interactions. It bears noting that EpiScope was still able to identify epitope residues: in the crystal structure of the D8-LA5 complex (PDB ID 4ETQ), two designs fall within the LA5 Ab binding regions (Figure 5—figure supplement 2).
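The design-level comparison between Abs (Figure 5C) relies on a Hausdorff distance between sets of mutation sites. A minimal sketch follows, assuming each design is represented by the 3D coordinates of its mutated positions and that the per-Ab value is the plain average over all cross-Ab pairs of designs; the averaging scheme is an assumption, since the text does not specify it.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


def average_hausdorff(designs_x, designs_y):
    """Mean Hausdorff distance over all pairs of designs from two Abs
    (assumed averaging scheme)."""
    dists = [hausdorff(dx, dy) for dx in designs_x for dy in designs_y]
    return sum(dists) / len(dists)


# Toy designs: each is a list of mutated-site coordinates (Å).
design_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
design_b = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
toy_distance = hausdorff(design_a, design_b)
```

Identical mutation sites give a distance of 0, matching the black end of the scale in Figure 5C.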

Since Ab recognition of Ag is driven by the Ab complementarity determining regions (CDRs), Abs with very similar CDR sequences would be expected to have very similar epitopes. To explore the impact of CDR sequence similarity on the relationship among docking, disruptive design, and binning, we selected one Ab for each of the seven unique sets of CDR sequences represented among the 12 Abs (yielding JE11, AB12.2, BG9.1, EB2.1, EE11, JE10, and FH4.1). The experimental epitope binning results for this subset of Abs largely reflected heavy chain CDR (CDR-H) sequence similarity, and did not significantly depend on light chain CDRs (Figure 5—figure supplement 3). Thus we may conclude that for this set of Abs, the CDR-Hs largely drive binding. However, without such experimental data, it is not obvious how important each CDR is to the binding profile; e.g., there are certainly cases where light chains are very important for strong binding (Ko et al., 2015). Unfortunately, simply comparing overall CDR sequence similarity (‘all’ in Figure 5—figure supplement 3), as might be appropriate without any assumptions, yields a pattern that does not reflect binning as well as that for individual CDR-Hs. More generally, it is not easy to predict how much variation in an individual CDR sequence will impact binding, or how to combine variation across multiple CDRs to assess an overall effect. On the other hand, EpiScope naturally integrates this sequence information into structural models, predictions of possible binding modes, and design of disruptive mutations. Thus EpiScope’s ‘binning’ pattern does reflect the experimental binning results.

With Abs binned into four separate groups, a panel of Ag variants could be designed to localize the epitope of each group based on a representative member; here, for the sake of testing we used the Ab from each group that had previously been structurally characterized (Matho et al., 2014). While a total of 18 designs would be required to localize each Ab independently (JE11: 5, CC7.1: 4, EE11: 4, LA5: 5), a multi-Ab panel of only four variants could simultaneously cover all 116 docking models from all 4 Abs (Figure 5D). Remarkably, each design contained one or two mutations overlapping the characterized binding interface of the representative Ab from a particular binning group, suggesting that the variants localized to meaningful epitope regions and may further serve as epitope probes with which to characterize or select new Abs with varying specificities in the future. This investigation thus demonstrated that not only is it possible to obtain epitope grouping information without experimental effort, but also that subsequent experimental effort can be greatly reduced while localizing multiple Ab epitopes. Designed Ag variants could be of further utility as probes to profile epitope specificities of polyclonal serum samples and to investigate correlates of vaccine protection or efficacy.

Purely computational prediction

To characterize potential results from purely computational epitope prediction, we applied to all of our targets a state-of-the-art predictor, the computational component of the integrated epitope localization method PEASE (Sela-Culang et al., 2014). Similar to EpiScope, PEASE accepts Ab sequences and Ag structures as inputs, but instead of explicitly docking Ab and Ag, it utilizes machine learning methods based on Ab-Ag binding interface characteristics to predict epitope ‘patches’ of 4–5 residues each. While the PEASE approach further incorporates the results of competition experiments to refine patch predictions for epitope-grouped antibodies, here we consider the accuracy of just the Ab-specific patches themselves. In characterizing PEASE results we used the ‘residue-score’ (RS) cutoff of 0.43, which was determined optimal in retrospective test cases (Sela-Culang et al., 2014), but we note that results were substantially affected by this value, adding a layer of complexity to the analysis. PEASE provides a ranking of patches, so to make a balanced comparison, we considered as many top-ranked PEASE predictions for each target as there were variants designed by EpiScope.

For TZ47, none of the top 4 PEASE patches (covering 13 residues) contained residues proximal to the localized epitope, but for PB11 the top PEASE patch (five residues) contained at least two residues overlapping with mutations in disruptive chimera or EpiScope designs (Table 7). Over all of the retrospective test cases (Table 8), the top PEASE patches overlapped with epitope residues 52% of the time (compared to 88% with crystal structures and 85% with homology models for EpiScope designs). EpiScope succeeded in 14 cases where PEASE failed and PEASE succeeded in two cases where EpiScope failed. When the computation-only portion of PEASE was applied to the set of 12 VACV Abs, only ~40% of residues contained within the combined set of top predicted patches were part of Ab epitopes (Sela-Culang et al., 2014). When considering the top ranked PEASE patch prediction for each of the four structurally characterized representative Abs, only two epitopes were localized, one correctly (Group I epitope predicted for Group I Ab JE11) and one fortuitously (Group IV epitope predicted for Group II Ab CC7.1). Furthermore, the same top patch prediction was returned for 3 out of the 4 Abs. However, residues comprising the binding interface overlapped with at least one of the top 11 PEASE patch predictions for each Ab, suggesting a potential benefit to incorporating PEASE predictions into the generation of hypotheses for EpiScope-directed experimental validation. Doing so would focus experimental effort on those patches that PEASE identifies as most important, based either on purely computational analysis or on integration of prior experimental data.

Table 7
Comparison of residues predicted by PEASE for TZ47 and PB11 to mutations included in disruptive EpiScope designs.

Residue score cut-off 0.43 was used for PEASE.

https://doi.org/10.7554/eLife.29023.033
| Patch | Predicted patch residue positions | Patch score | Disruptive EpiScope design mutation positions |
|---|---|---|---|
| TZ47-Patch1 | 158, 159, 160, 161, 162 | 0.41 | 154, 157, 217 (TZ47-Ag4) |
| TZ47-Patch2 | 158, 160, 161, 162, 163 | 0.4 | 154, 157, 216 (MULTI-4) |
| TZ47-Patch3 | 1, 29, 30, 31, 32 | 0.4 | |
| TZ47-Patch4 | 1, 2, 30, 31, 106 | 0.39 | |
| PB-Patch1 | 1, 2, 30, 31, 106 | 0.47 | 51, 52, 99 (PB-Ag2) |
| PB-Patch2 | 46, 47, 48, 49, 50 | 0.41 | 57, 84, 98 (MULTI-1) |
| PB-Patch3 | 158, 160, 161, 162, 163 | 0.4 | |
| PB-Patch4 | 195, 196, 197, 198, 203 | 0.38 | |
| PB-Patch5 | 123, 124, 125, 126, 139 | 0.38 | |

Table 8
Comparison of predictive components of PEASE and EpiScope on retrospective test set of 33 non-redundant Ab-Ag pairs.

The number of designs needed/considered indicates the number of designs generated by EpiScope to cover all ClusPro docking models. An equivalent number of the top ranked PEASE patch predictions are considered for each Ab. Coloring highlights the cases in which EpiScope (green) or PEASE (red) succeeded where the other method failed. Grey coloring indicates cases in which both methods failed.

https://doi.org/10.7554/eLife.29023.034
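| Target | Crystal Ag: # of designs needed/considered | Crystal Ag: # of EpiScope designs overlapping true epitope | Crystal Ag: # of PEASE patches overlapping true epitope | Modeled Ag: # of designs needed/considered | Modeled Ag: # of EpiScope designs overlapping true epitope | Modeled Ag: # of PEASE patches overlapping true epitope |
|---|---|---|---|---|---|---|
| 1FE8 | 4 | 2 | 4 | 3 | 0 | 0 |
| 1FNS | 5 | 2 | 5 | 2 | 1 | 2 |
| 1H0D | 3 | 0 | 0 | 2 | 1 | 0 |
| 1LK3 | 3 | 1 | 0 | 3 | 1 | 0 |
| 1OAZ | 2 | 0 | 2 | 2 | 0 | 2 |
| 1OB1 | 3 | 1 | 0 | 4 | 2 | 4 |
| 1RJL | 3 | 1 | 3 | 2 | 1 | 2 |
| 1V7M | 6 | 1 | 0 | 3 | 1 | 2 |
| 1YJD | 2 | 1 | 0 | 3 | 1 | 0 |
| 2ARJ | 3 | 1 | 3 | 3 | 1 | 3 |
| 2VXQ | 3 | 1 | 3 | 3 | 1 | 3 |
| 2VXT | 3 | 1 | 3 | 3 | 1 | 3 |
| 2XQB | 2 | 1 | 2 | 3 | 1 | 3 |
| 3D9A | 3 | 1 | 0 | 2 | 1 | 0 |
| 3HI1 | 4 | 0 | 0 | 3 | 0 | 0 |
| 3L5X | 6 | 1 | 6 | 3 | 2 | 3 |
| 3MXW | 2 | 1 | 2 | 2 | 1 | 2 |
| 3QWO | 3 | 2 | 3 | 4 | 3 | 4 |
| 3RKD | 5 | 2 | 3 | 3 | 1 | 2 |
| 4DN4 | 2 | 1 | 0 | 2 | 1 | 0 |
| 4DW2 | 4 | 1 | 2 | 3 | 1 | 1 |
| 4ETQ | 4 | 1 | 1 | 3 | 1 | 1 |
| 4G3Y | 3 | 1 | 0 | 4 | 2 | 3 |
| 4G6J | 3 | 1 | 2 | 3 | 1 | 3 |
| 4I3S | 4 | 2 | 3 | 4 | 1 | 2 |
| 4JZJ | 4 | 1 | 0 | 6 | 4 | 0 |
| 4KI5 | 1 | 1 | 0 | 1 | 1 | 0 |
| 4L5F | 2 | 1 | 0 | 3 | 1 | 0 |
| 4LVH | 5 | 1 | 0 | 6 | 0 | 2 |
| 4M62 | 2 | 1 | 0 | 2 | 0 | 0 |
| 4NP4 | 3 | 1 | 0 | 5 | 3 | 0 |
| 4RGO | 3 | 0 | 2 | 6 | 1 | 3 |
| 5D96 | 4 | 1 | 0 | 4 | 1 | 0 |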
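
As a worked check, the tallies quoted in the text for the crystal-structure case can be reproduced from the crystal-structure columns of Table 8. The values below are transcribed from the table; the variable names are illustrative, and a method is counted as successful for a target if at least one of its designs or patches overlaps the true epitope.

```python
# (target, designs considered, EpiScope designs on epitope, PEASE patches on epitope)
# Crystal-structure columns of Table 8.
CRYSTAL = [
    ("1FE8", 4, 2, 4), ("1FNS", 5, 2, 5), ("1H0D", 3, 0, 0), ("1LK3", 3, 1, 0),
    ("1OAZ", 2, 0, 2), ("1OB1", 3, 1, 0), ("1RJL", 3, 1, 3), ("1V7M", 6, 1, 0),
    ("1YJD", 2, 1, 0), ("2ARJ", 3, 1, 3), ("2VXQ", 3, 1, 3), ("2VXT", 3, 1, 3),
    ("2XQB", 2, 1, 2), ("3D9A", 3, 1, 0), ("3HI1", 4, 0, 0), ("3L5X", 6, 1, 6),
    ("3MXW", 2, 1, 2), ("3QWO", 3, 2, 3), ("3RKD", 5, 2, 3), ("4DN4", 2, 1, 0),
    ("4DW2", 4, 1, 2), ("4ETQ", 4, 1, 1), ("4G3Y", 3, 1, 0), ("4G6J", 3, 1, 2),
    ("4I3S", 4, 2, 3), ("4JZJ", 4, 1, 0), ("4KI5", 1, 1, 0), ("4L5F", 2, 1, 0),
    ("4LVH", 5, 1, 0), ("4M62", 2, 1, 0), ("4NP4", 3, 1, 0), ("4RGO", 3, 0, 2),
    ("5D96", 4, 1, 0),
]

n_targets = len(CRYSTAL)
episcope_hits = sum(1 for _, _, e, _ in CRYSTAL if e > 0)
pease_hits = sum(1 for _, _, _, p in CRYSTAL if p > 0)
episcope_only = sum(1 for _, _, e, p in CRYSTAL if e > 0 and p == 0)
pease_only = sum(1 for _, _, e, p in CRYSTAL if e == 0 and p > 0)

print(f"EpiScope: {episcope_hits}/{n_targets} = {100 * episcope_hits / n_targets:.0f}%")  # 29/33 = 88%
print(f"PEASE:    {pease_hits}/{n_targets} = {100 * pease_hits / n_targets:.0f}%")        # 17/33 = 52%
print(f"EpiScope only: {episcope_only}, PEASE only: {pease_only}")                      # 14 and 2
```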

Discussion

We have demonstrated that the combination of computational docking and computational protein design can explicate binding information implicitly encoded in Ab sequence and thereby drive efficient localization of epitopes. To our knowledge, EpiScope is the first method to directly optimize experimental validation of in silico epitope predictions, designing rich combinations of mutations in Ag variants to disrupt predicted Ab binding. We successfully localized epitopes prospectively for two Abs using only Ab sequence and Ag structure as inputs to the design process. This work thus significantly elaborates the recent push for integrated computational-experimental epitope mapping, which previously required initial incorporation of experimental data from competitive epitope binning assays (Sela-Culang et al., 2014) or neutralization assays of viral variants (Chuang et al., 2013) in order to generate a ranked list of predicted epitopes for experimental evaluation. In contrast, we show that starting with only sequence information, comprehensive experimental testing can be optimized to cover all epitope predictions.

Retrospective analysis bolstered the generality of the lessons from the successful prospective application, demonstrating an expected 88% success rate with only about three experiments – a high likelihood of successful epitope localization with minimal experimental effort. Notably, the design process itself determines how many experiments are required to test all the computational hypotheses. While docking itself was not sufficient to confidently identify epitopes, the information it provided was critical in driving the experiments, as performance suffered when using random size-matched designs instead of docking-based ones. In expansion to multiple Abs against a single Ag, the computational analysis integrated information about sequence and structural similarity into hypotheses about binding similarity, and by itself was sufficient to epitope bin 12 Abs targeting four different epitopes on a single Ag. By leveraging commonalities of putative epitopes, multi-Ab-targeting sets of Ag variant designs were comparable in size to those for single Abs, and were indeed sufficient to localize all epitopes simultaneously in both retrospective and prospective cases. Thus, this methodology may be useful for the translation of high-throughput Ab repertoire sequencing data into the identification and definition of clinically relevant epitopes. The ability to epitope bin in silico and localize multiple epitopes through the rational design of optimal Ag variant panels offers the potential to incorporate epitope diversity earlier into Ab development pipelines.

The integrated computational-experimental approach thus employs computational modeling to extract and exploit crucial features of molecular recognition encoded in the amino acid sequences of the Ab and Ag. By computationally focusing experimental effort, it offers potentially significant time and cost savings relative to purely experimental evaluation methods, and potentially significant accuracy improvements relative to purely computational prediction methods. It also strikes a balance between the relatively high-resolution localization provided by comprehensive experimental studies, and the relatively low-resolution information provided by high-throughput competition studies. We now discuss some of the impacts of relying on computation, the niche filled relative to experimentally-driven efforts, and the outlook for future developments and studies.

Limitations

In general, computationally-driven methods critically depend on the quality of inputs provided, the degrees of freedom allowed, and the algorithms employed. These factors impact both what is possible with computationally-driven epitope mapping and how well it performs in different scenarios.

Ab homology models

With the ever-increasing number of solved crystal structures, Ab homology modeling approaches routinely achieve Angstrom-level accuracy for framework regions and CDRs other than H3, for which the state of the art is typically 1.5–3 Å (Marks and Deane, 2017). This level of accuracy was obtained here, despite limiting template identity, and consequently Ab model quality was not a major driving factor for success (Figure 6—figure supplement 1A and B). In fact, the best modeled Ab structure, 1FE8 (CDR-H3 RMSD: 0.63 Å), failed, perhaps due to poor docking (fnat with the crystal Ag structure: 0.1, just above ‘acceptable’), whereas epitope-overlapping positions were identified using the worst Ab model, 2XQB (CDR-H3 RMSD: 7.21 Å and fnat: 0.32, ‘medium’). In settings where Abs are harder to model (e.g., antibodies with very long CDR-H3s, or post-translationally modified CDRs), performance may suffer. While in theory the approach presented here may apply equally well to alternative formats from other species and antibody-mimetics, in practice it depends on the quality of the resulting models. Performance may even be better for formats with more-constrained and more-easily-modeled binding regions.

Ag homology models

As with Abs, while modeling success on any specific Ag target depends very much on the availability of high-quality, well-matched templates, the continued expansion of structural databases and improvements in modeling algorithms have led to fairly routine Angstrom-level models (Moult et al., 2016). Here, even while again attempting to use only moderately-similar templates, the models were uniformly of high quality, and the quality did not appear to play a significant role in the results (Figure 6—figure supplement 1C and D). For example, the Ag structure of 4RGO was accurately predicted but failed, again perhaps due to poor docking (fnat: 0.16). However, the worst model, 4JZJ, still yielded a successful design possibly because the binding interface was still modeled sufficiently accurately to support docking (fnat: 0.31). While not observed here, if, compared to the model, the Ag undergoes substantial conformational changes affecting the epitope region, or if post-translational modifications interfere with (or even comprise) the epitope, epitope localization results may certainly suffer.

Ab:Ag docking models

As discussed in the introduction, docking generally produces an acceptable-quality model among the top set (Brenke et al., 2012). Furthermore, docking tends to perform better with crystal structures than with homology models (Rodrigues et al., 2013). We observed both phenomena here (Figure 6). This analysis also showed that docking model quality was the major factor driving success or failure of the approach. EpiScope was always able to identify epitopes for targets with good docking models, i.e., those above ‘medium’ according to the CAPRI definition (Lensink et al., 2007). While the failure cases all had poor docking models (e.g., Figure 6—figure supplement 2A), EpiScope did sometimes succeed even in cases with poor docking models (4I3S; fnat: 0.05). With continued improvement of docking methods, the ‘medium’ case may become the norm, but for any particular target, docking may fail; this is a particular risk if there are poorly modeled portions of the Ab or Ag, or if there is substantial conformational change upon binding.
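For readers unfamiliar with fnat, the following sketch shows how the fraction of native contacts is computed and how it maps to quality bands. It is a simplification under stated assumptions: contacts are given as pairs of residue identifiers, and the band thresholds (0.1, 0.3, 0.5) are the commonly cited CAPRI fnat cut-offs; the full CAPRI classification additionally uses ligand and interface RMSD criteria, which are omitted here.

```python
def fnat(native_contacts, model_contacts):
    """Fraction of native Ab-Ag residue contacts reproduced by a docking model.

    Each contact is a hashable pair such as (ab_residue_id, ag_residue_id).
    """
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & set(model_contacts)) / len(native)


def fnat_band(value):
    """Quality band from fnat alone (simplified; RMSD criteria omitted)."""
    if value >= 0.5:
        return "high"
    if value >= 0.3:
        return "medium"
    if value >= 0.1:
        return "acceptable"
    return "incorrect"


# Values quoted in the text: 2XQB (0.32), 1FE8 (0.1), 4I3S (0.05).
bands = {name: fnat_band(v) for name, v in
         [("2XQB", 0.32), ("1FE8", 0.1), ("4I3S", 0.05)]}
```

Under this simplified banding, 2XQB falls in the ‘medium’ band, 1FE8 sits exactly at the ‘acceptable’ threshold, and 4I3S falls below it, consistent with the descriptions above.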

Figure 6
Success of EpiScope and the quality of docking models.

In general, docking using Ag crystal structures is better than using Ag homology models according to the fnat value; it is above ‘medium’ for crystal structures but only ‘acceptable’ for model structures. Poor docking models are necessary, but not sufficient, for the failure of the EpiScope approach: EpiScope still identifies epitopes for some poorly docked models, but all failed cases have low fnat values.

https://doi.org/10.7554/eLife.29023.035

In addition to the quality of docking models, their number and diversity will also affect the results, as more variants may be required to cover more diverse models. By using ClusPro here, each target had only a relatively small (around 30) and diverse set of docking models clustered from a much larger initial set. There was still some correlation observable between the number of docking models and the number of final selected designs: a correlation coefficient (Kendall’s τ) of 0.397 for Ag crystal structures, and 0.443 for Ag homology models.
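Kendall’s τ, used here and in the mutations-per-design analysis, measures rank agreement between two paired series. The sketch below implements the tie-corrected τ-b variant in plain Python; which variant was used in the study is not stated, so this choice is an assumption, and the example numbers are made up for illustration rather than taken from the data.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation between two equal-length sequences."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: ignored
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x)
                      * (concordant + discordant + ties_y))
    if denom == 0:
        raise ValueError("tau is undefined when one variable is constant")
    return (concordant - discordant) / denom


# Hypothetical numbers of docking models and selected designs for five targets.
n_models = [13, 20, 24, 28, 30]
n_designs = [2, 3, 3, 5, 4]
tau = kendall_tau_b(n_models, n_designs)
```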

Ag mutations

Sets of mutations must be chosen to disrupt Ab binding while preserving Ag stability. While in general modeling the effects of mutations on binding and stability remains very challenging, the case here is perhaps the most benign scenario for both, in that we seek only to disrupt binding (not improve it) and maintain stability (not improve it) while mutating solvent exposed (not core) residues that are fairly well spread apart (typically not directly interacting). Thus, much as with alanine scanning, the chosen mutations can be expected to be relatively benign. For the prospective application, we showed that indeed the designed mutations tested did not destabilize the Ag in terms of eliminating its ability to bind its natural ligand. While we could not explicitly evaluate that for the retrospective studies, the molecular modeling method employed is one of many similar well-established techniques for predicting and designing for stability and has been successfully applied to other challenging cases such as mutation of hydrophobic core residues in T cell epitope deletion (Blazanovic et al., 2015; Salvat et al., 2015; Zhao et al., 2015).

By restricting mutations to those appearing among homologs, we leverage nature’s experiments to improve the chance of success, though at the cost of reducing the degrees of freedom to consider. This appears to have caused one of the failures, when the lack of homologous sequence information for portions of the Ag limited the mutational choices considered (Figure 6—figure supplement 2B–C). Alternative approaches could leverage structural modeling to fill in such gaps and even to expand the mutations considered throughout based on initial individual energy evaluations. Sequence and structural modeling could even be integrated in a Pareto optimal fashion (He et al., 2012) to balance reliance on both sources of information.

Computational cost

Given docking models, the design process proceeds through several steps, with the most computationally intensive being designing sets of disruptive mutations for each docking model independently, and clustering these designs to cover all models simultaneously. We used here a design algorithm based on integer linear programming, thereby generating provably optimal global designs rather than stochastically sampling. This guaranteed optimization approach comes at some computational cost, but given the relatively few degrees of freedom here, it required less than a day per target when using 10 nodes on a cluster. The selection of designs covering all models was performed by a K-medoids clustering script followed by exhaustive testing of combinations across clusters. Both steps could be further optimized or optimality could be traded for efficiency, but the time required for computation is already much cheaper than that required for experiment.
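The goal of the selection step, finding a small panel of designs that together disrupt every docking model, can be illustrated as a brute-force set-cover search. This sketch is a stand-in for illustration only: it omits the K-medoids clustering used in the study and simply enumerates panels of increasing size, and the input format (a mapping from design name to the indices of the models it disrupts) is an assumption.

```python
from itertools import combinations


def smallest_covering_panel(disrupts, n_models):
    """Return a smallest list of designs that together disrupt all models.

    disrupts: dict mapping design name -> set of docking model indices that
              the design is predicted to disrupt (hypothetical input format).
    Returns None if no combination covers every model.
    """
    all_models = set(range(n_models))
    names = sorted(disrupts)
    for k in range(1, len(names) + 1):
        for panel in combinations(names, k):
            covered = set().union(*(disrupts[d] for d in panel))
            if covered >= all_models:
                return list(panel)
    return None


# Toy example with four docking models and four candidate designs.
toy_disrupts = {"A": {0, 1}, "B": {2, 3}, "C": {1, 2}, "D": {0, 1, 2}}
toy_panel = smallest_covering_panel(toy_disrupts, n_models=4)
```

Exhaustive enumeration grows combinatorially with the number of candidates, which is one reason to cluster the designs first and combine only across clusters.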

Resolution

While experimental approaches, from traditional alanine-scanning mutagenesis to advanced techniques including the combination approach of comprehensive mutagenesis libraries and deep sequencing (Kowalsky et al., 2015), can provide unparalleled residue-level detail, they can require significantly more experimental effort than the computationally-directed approach. In addition, mutation of non-epitope residues can confer phenotypic changes in binding (Guinto et al., 1999; Dang et al., 1997), resulting in false assignment of these residues to the binding interface (Greenspan and Di Cera, 1999). EpiScope attempts to address these limitations by evaluating and optimizing the potential of individual residue changes to disrupt stability and Ab binding in predicted docking models, although there theoretically remains the potential for EpiScope-designed mutations to similarly misrepresent epitopes. Fine-grained epitope characterization is often a goal during late stages of Ab development after the number of promising candidates has been narrowed down to a manageable number for such time- and cost-intensive efforts. However, consideration of epitopes earlier during initial large-scale screens may enable selection for epitope diversity, and may be better served by the more efficient and less resource-intensive, albeit less-detailed, characterization afforded by EpiScope.

Chimeragenesis provides an alternative method to incorporate multiple potentially Ab-disrupting but stability-preserving mutations. However, designing suitable chimeras is difficult as is the interpretation of binding assay results, due to the complex relationship between linear recombination and spatial organization of epitopes. Anecdotally, we had tested eight chimeric variants of macaque and human B7H6 homologs in an attempt to localize the TZ47 epitope, but failed to do so before designing a 9th chimera (in Results) based on the EpiScope-designed disruptive variant. We note that the computational design was ‘turn key’ based on only Ab sequence and Ag structure and succeeded with the promised number of experiments. The designed Ag variants also contained fewer mutations on average than chimeras, providing greater resolution on the binding epitope and potentially greater expression fidelity.

EpiScope balances the level of detail obtained in epitope localization with the experimental effort required, using only a few variants but leaving the epitope only roughly defined. It provides additional docking information that is not captured by standard mutagenesis-based methods, of particular use for those seeking to engineer the epitope or paratope to create enhanced reagents. On the other hand, no single computationally generated docking model is necessarily correct (recall average fnat = 32%), so follow-up experiments would be necessary to achieve the level of resolution provided by experimentally driven efforts discussed above. As demonstrated by retrospective cases, EpiScope can significantly decrease the amount of subsequent effort required by filtering relevant Ag surface residues to the vicinity of affected computational docks. Results of focused experiments, such as alanine scanning, may then be used to further filter/predict the most native-like docking model. Alternatively, a subsequent computationally-driven round of targeted mutations could be optimized. Docking models would be concentrated on the region, disruptive mutations designed, and then variants selected so as to expand the set of identified epitope residues and more finely discriminate among the models. In summary, EpiScope requires minimal prior knowledge and experimental effort and most efficiently ensures the successful localization of Ab epitopes given Ab sequence and Ag structure as inputs to the design process.

Outlook

Future uses of computationally-directed epitope mapping may enable high-throughput Ab epitope binning and localization using a minimal set of Ag variants for large panels of Abs, offering opportunities for the profiling of polyclonal serum samples in various disease settings and/or earlier selection for epitope specificity in Ab discovery pipelines (Brooks et al., 2014). Collectively, such high-throughput epitope characterization combined with rapidly advancing B-cell isolation and NGS technology may enable insights into the development of humoral immunity contributing to health/disease, including investigations of which epitopes more/less commonly elicit Abs, are targeted by Abs at different stages of disease, correlate with clinical status, etc. Investigations of such critical questions could then inform immunogen design and vaccine strategies, or the de novo design of therapeutic Abs targeting functionally relevant epitopes to enable novel mechanisms of action. In summary, EpiScope utilizes Ab sequence-encoded binding information to offer a highly efficient epitope localization strategy to keep up with rapidly advancing Ag-specific B cell sorting and next generation sequencing efforts and offers the exciting potential to advance early Ab discovery and development efforts, evaluation of humoral responses in various disease/vaccination settings, and rational epitope-focused vaccine design.

Materials and methods

Computational method: EpiScope

As summarized in Figure 1, the computational design component of the EpiScope protocol generates representative Ab:Ag docking models, designs Ag variants so as to disrupt the various models, and selects a small set of those variants so as to ensure all (or as many as possible) of the docking models will be disrupted by at least one variant. These steps were instantiated here as follows.

Representative docking models were generated from the ClusPro webserver (Comeau et al., 2004b) in Ab mode with non-CDR masking (Brenke et al., 2012). All returned models were treated equally for subsequent analysis; depending on the target, this included from 13 to 30 representative docking models for the retrospective test sets (Figure 1A).

Ag variants were designed by a customized version of the EpiSweep protein redesign algorithm (Choi et al., 2017; Parker et al., 2013), modified to delete predicted Ab epitopes instead of predicted T cell epitopes (Figure 1B). Briefly, the protein design method selected from 1 to 4 mutations per Ag design predicted to be disruptive of Ab binding according to a docking model while simultaneously ensuring that the mutations would not be detrimental to Ag stability. Binding disruption was predicted according to the INT5 statistical potential (Pons et al., 2011), part of the SIPPER scoring function shown to be successful in protein docking benchmarks (Pons et al., 2011; Moal et al., 2013). A rotameric energy was used to predict effects of mutation on stability (Pearlman et al., 1995; Chen et al., 2009; Gainza et al., 2012). In order to restrict mutations to those most likely to be acceptable for Ag stability, a homology-based filter was employed (Choi et al., 2017; Parker et al., 2013), considering only evolutionarily-accepted variations appearing in homologous sequences found within 3 iterations of PSI-BLAST (e-value <0.001) (Altschul et al., 1997). Amino acids that were not predicted to disrupt any of the docking models (disruptive potential score <0) were removed from the list of choices.
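The two mutation-choice filters described above (homology-accepted substitutions, and removal of amino acids not predicted to disrupt any docking model) can be illustrated with a minimal sketch. This is not the EpiSweep implementation; the function name, the alignment-column input, and the per-model score lists are hypothetical stand-ins for the PSI-BLAST and INT5 outputs.

```python
def candidate_mutations(wild_type, alignment_column, scores_by_model):
    """Return substitutions allowed at one Ag position.

    wild_type        -- wild-type amino acid at this position
    alignment_column -- residues observed at this position among homologs
                        (hypothetical stand-in for the PSI-BLAST hits)
    scores_by_model  -- amino acid -> list of predicted disruption scores,
                        one per docking model (hypothetical stand-in for INT5)
    """
    # Homology filter: only variation seen in homologs, excluding gaps and WT.
    seen = set(alignment_column) - {wild_type, "-"}
    # Disruption filter: drop amino acids scoring < 0 against every model.
    return sorted(
        aa for aa in seen
        if any(score >= 0 for score in scores_by_model.get(aa, []))
    )
```

In this sketch an amino acid with no score entry is treated as non-disruptive and dropped, which is an assumption rather than something stated in the text.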

After the generation of sets of mutations for each docking model, further filtration steps were performed. Any design whose rotameric energy was worse than the wild type was excluded, as was any whose mutations had average Cα distances >12 Å. Designs often overlapped with multiple docking models; only designed mutations that were disruptive to all of the docking poses in contact were included. In the case of identical positions with different mutations, the one with the most disruptive binding score was considered.
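The first two filtration steps can be sketched as follows. The sketch assumes that lower rotameric energy is better, and that the "average Cα distance" is the mean pairwise Euclidean distance among the mutated positions of a design; the `Design` class and function names are illustrative, not part of the published software.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass
class Design:
    positions: list        # Cα coordinates (x, y, z) of mutated residues, in Å
    rotamer_energy: float  # rotameric energy; lower is assumed to be better


def mean_pairwise_distance(coords):
    """Mean Euclidean distance over all pairs of coordinates (0.0 if < 2 points)."""
    pairs = list(combinations(coords, 2))
    if not pairs:
        return 0.0
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def passes_filters(design, wt_energy, max_mean_dist=12.0):
    """Keep a design only if it is no worse than wild type and spatially compact."""
    if design.rotamer_energy > wt_energy:
        return False
    return mean_pairwise_distance(design.positions) <= max_mean_dist
```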

All remaining designs were then clustered using the K-medoids algorithm with the Hausdorff distance for the Cα coordinates of mutated positions. Intuitively, the Hausdorff distance assesses the distance between two sets of points by finding for each point in one set the closest neighbor in the other, and identifying the furthest neighbors. More precisely, if $A$ has $n$ points $a_1, a_2, \ldots, a_n$ and $B$ has $m$ points $b_1, b_2, \ldots, b_m$, then the Hausdorff distance $h(A,B)$ between $A$ and $B$ is

$$h(A,B) = \max\left\{ \max_{a \in A} \min_{b \in B} d(a,b),\ \max_{b \in B} \min_{a \in A} d(b,a) \right\}$$
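As a small illustration of this definition, the following sketch computes the Hausdorff distance between two sets of 3D points. It assumes that d is the Euclidean distance between Cα coordinates; the function names are illustrative only.

```python
from math import dist


def directed_hausdorff(A, B):
    """Largest distance from a point in A to its nearest neighbor in B."""
    return max(min(dist(a, b) for b in B) for a in A)


def hausdorff(A, B):
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    return max(directed_hausdorff(A, B), directed_hausdorff(B, A))


# Two designs, each given by the Cα coordinates of its mutated positions.
design_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
design_b = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
print(hausdorff(design_a, design_b))  # 4.0
```

The example shows why both directions are needed: every point of the first design lies within 1 Å of the second, but the point at x = 5 in the second design is 4 Å from its nearest neighbor in the first.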

K-medoids clustering was implemented using a script from the pyclust package (version 0.1.3, obtained from PyPI), modified to employ Hausdorff distance. K was started at one and was increased until any combination of designs, one from each cluster, was found to disrupt all of the docking models (Figure 1B). The set of designs with the most disruptive binding score was selected as the final set (Figure 1C). An initial implementation of EpiScope used for the prospective application employed an alternative approach to variant set selection, starting with a ‘centroid’ of all designs with addition of variants to maximize coverage of all docking models. Designs for both the prospective targets generated using the current implementation included disruptive mutations proximal to those generated with the previous implementation and validated experimentally in the results.
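The selection loop over K can be sketched as below. This is a simplified illustration under stated assumptions, not the EpiScope code: `cluster_fn` is a hypothetical callable standing in for the K-medoids step, "any combination" is read as "at least one combination", higher scores are taken to mean more disruptive, and the score of a set is taken to be the sum of its designs' scores.

```python
from itertools import product


def best_covering_combination(clusters, disrupts, scores, all_models):
    """Pick one design per cluster so that every docking model is disrupted.

    clusters   -- list of clusters, each a list of design ids
    disrupts   -- design id -> set of docking model ids it disrupts
    scores     -- design id -> disruption score (higher = more disruptive)
    all_models -- set of all docking model ids
    Returns the covering combination with the highest summed score, or None.
    """
    best = None
    for combo in product(*clusters):
        covered = set().union(*(disrupts[d] for d in combo))
        if covered >= all_models:
            total = sum(scores[d] for d in combo)
            if best is None or total > best[0]:
                best = (total, combo)
    return None if best is None else best[1]


def select_variants(cluster_fn, n_designs, disrupts, scores, all_models):
    """Increase K from 1 until some one-per-cluster combination covers all models."""
    for k in range(1, n_designs + 1):
        combo = best_covering_combination(cluster_fn(k), disrupts, scores, all_models)
        if combo is not None:
            return k, combo
    return None
```

Enumerating every combination is exponential in K, which is tolerable here only because the final sets are small (a handful of variants per antibody).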

Structure preparation for B7H6

The crystal structure of unbound B7H6 was previously determined (PDB code: 3PV7) at 2.0 Å. There is a missing loop in the crystal structure (Chain A 150–157; DQVGMKEN). The loop was modeled using FREAD (Choi and Deane, 2010; Kelm et al., 2014) and a loop from 1O57:A 253–260 (STINMKEK) was found to be the best match. Anchor residues of the grafted loop (including backbone atoms) were minimized using the Tinker molecular dynamics package (ver. 6; Ponder, 2004) with the AMBER99sb force field (Hornak et al., 2006) and the GB/SA implicit solvent model (Still et al., 1990), while keeping the overall loop structure.

TZ47 is a murine Ab that binds to human B7H6. Its structure is unknown and thus the homology model from a previous study (Choi et al., 2015) was used. A model for PB11 was created using the PIGS modeling server (Marcatili et al., 2015; Marcatili et al., 2014). The model structure including backbone atoms was minimized using Tinker as described above. ClusPro in Ab mode with non-CDR masking generated 28 docking poses for each Ab.

Retrospective test set

Thirty-three Ab-Ag pairs with complex structures solved by X-ray crystallography (Table 3) were selected from SAbDab (Dunbar et al., 2014) according to the following criteria: pairwise Ab sequence identity <70%, pairwise Ag sequence identity <70%, resolution <3 Å, single-chained, and >50 and <300 residues in length. Structures missing any backbone atoms were excluded.

In order to consider practical applications in a realistic setting, all Ab structures were homology modeled in a manner established to simulate ‘hard-to-model’ situations, for which template sequence identity is less than 90% (Fasnacht et al., 2014). Abs were modeled using the PIGS webserver with restricted sequence identity templates (<80% to the target) followed by side chain energy minimization using Tinker as described above. The Ab models were highly accurate (Table 4), both overall (TM-score: 0.95) and for CDRs (<2 Å for CDR-H3 and sub-Angstrom for others), consistent with a report on CDR-H3 modeling quality (Choi and Deane, 2011).

Ag targets were either crystal structures from the bound complexes, or homology models built by SWISS-MODEL (Biasini et al., 2014) with default parameters applied to templates. Again, in order to represent realistic application, templates were selected to keep sequence identity low. When possible, the sequence identity was restricted to be less than 50%, but for nine cases it had to be increased due to the lack of any templates at that cutoff. The resulting average template sequence identity was 46%, ranging from 23% to 99% (Table 5).

Experimentally identified epitopes were obtained from the IEDB (Vita et al., 2015).

For the multiple Ab test set, 12 Abs were modeled from sequences (Sela-Culang et al., 2014) as described above. PDB code 4E9O:X was used for the vaccinia D8 structure. FREAD identified 2AZW:A 87–91 (SNHRQ) as the best match for the missing loop from 207 to 209 (SNHEG), and this structural fragment was grafted into the model as described above.

Antigen variant expression

Ag variants comprising complete extracellular and transmembrane B7H6 domains were ordered as gBlocks from Integrated DNA Technologies (IDT) and cloned into a HEK surface expression vector (pPPI4). Expression plasmids were transfected into HEK cells using polyethyleneimine as described previously (Choi et al., 2015). Briefly, HEK-293F cells at a density of 10⁶ cells/mL were transfected with antigen variant plasmids at a concentration of 1.0 mg/L each and cultured for 2 days before conducting cell staining experiments and flow cytometry analysis.

Cell lines

The HEK-293F cell line was purchased from ThermoFisher (Catalog #R79007). Cell lines were not verified or tested for mycoplasma contamination after purchase.

Fluorescent staining for flow cytometry

Fluorescent staining of cells was performed as described previously (Choi et al., 2015). Briefly, 96-well plates containing 2.5 × 10⁵ cells/well of HEK cells expressing designed Ag variants were washed 3x with PBS + 0.1% BSA (PBS-F) before a primary incubation for 1 hr with 100 nM TZ47, PB11 scFv-Fc, NKp30-Ig, H48 (negative control Ms Ab), or PG9 (negative control Hu Ab). Cells were then washed 3x with PBS-F before incubation with either Anti-Mouse-AlexaFluor488 or Anti-Hu-AlexaFluor647 secondary Abs for 20 min. After a final wash with PBS-F, cells were re-suspended in 200 μL PBS-F + propidium iodide (PI) to stain for dead cells. Live cell gates were drawn based on FSC vs. SSC and negative PI staining, and only this population was used for determination of AlexaFluor-488 or -647 signal. These cells were then separated into binding (fluorescence) positive or negative populations, as transient transfections generally include a population of HEK cells that did not take up any plasmid. Relative integrated mean fluorescence intensity (I-MFI) values of binding-positive cells were calculated as the product (% of total cells) × (geometric mean fluorescence intensity). To normalize for varying expression levels introduced by differences in transfection efficiency, the normalized relative I-MFI for each Ab was calculated by dividing by the relative I-MFI of NKp30-Ig binding to each design. To normalize for WT binding, normalized relative I-MFIs for each variant were divided by the normalized relative I-MFI of WT B7H6.
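The two-step normalization described above amounts to a ratio of ratios, as the following sketch shows. The function names and the (percent, geometric MFI) tuple layout are illustrative assumptions, and the numbers in the example are made up for the arithmetic only; they are not measurements from this study.

```python
def relative_imfi(percent_positive, geo_mfi):
    """Relative I-MFI: (% of total cells that are binding positive) x (geometric MFI)."""
    return percent_positive * geo_mfi


def normalized_binding(ab_variant, nkp30_variant, ab_wt, nkp30_wt):
    """Ab binding to a variant, normalized for expression and then for WT binding.

    Each argument is a (percent_positive, geometric_MFI) tuple.
    """
    # Step 1: correct for expression using NKp30-Ig binding to the same construct.
    variant_ratio = relative_imfi(*ab_variant) / relative_imfi(*nkp30_variant)
    wt_ratio = relative_imfi(*ab_wt) / relative_imfi(*nkp30_wt)
    # Step 2: express the variant signal as a fraction of the WT signal.
    return variant_ratio / wt_ratio


# Hypothetical values: Ab binding drops sharply while NKp30-Ig binding is retained.
value = normalized_binding(
    ab_variant=(10.0, 100.0),
    nkp30_variant=(40.0, 500.0),
    ab_wt=(50.0, 400.0),
    nkp30_wt=(50.0, 400.0),
)
```

A value near 1 indicates WT-like binding after correcting for expression, while a value near 0 indicates that the variant disrupts Ab binding.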

Data availability

The EpiSweep software used as the basis for implementing EpiScope is available under an academic-use license and may be accessed at http://www.cs.dartmouth.edu/~cbk/episweep. The general method and detailed instructions for installing and using EpiSweep have been previously published (Choi et al., 2017; Parker et al., 2013). Additional inputs for EpiScope, enabling reproduction of the design results reported here, are provided in the supplementary, source data, and source code files that accompany the article.

References

  48. MF Lensink, R Mendez, SJ Wodak (2007) Docking and scoring protein complexes: CAPRI 3rd Edition. Proteins 69:704–718.
  57. C Marks, CM Deane (2017) Antibody H3 Structure Prediction. Computational and Structural Biotechnology Journal 15:222–231. https://doi.org/10.1016/j.csbj.2017.01.010
  69. JW Ponder (2004) TINKER: Software Tools for Molecular Design. Saint Louis: Washington University School of Medicine.

Decision letter

  1. Max Vasquez
    Reviewing Editor; Adimab Inc., United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Computationally-driven Identification of Antibody Epitopes" for consideration by eLife. Your article has been reviewed by three peer reviewers, and the evaluation has been overseen by a Reviewing Editor and Arup Chakraborty as the Senior Editor. The following individual involved in review of your submission has agreed to reveal his identity: Timothy A Whitehead (Reviewer #1).

The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision, integrating all points from the individual reviews, to help you prepare a revised submission. We have also considered the possibility that this work may best be considered in our Tools and Resources (TR) category rather than as a Research Article (RA). That distinction may hinge on your responses as to why this method appears to work as that may contribute a fundamental understanding beyond the obvious value of your method.

Summary:

The manuscript describes a procedure, termed EpiScope, that combines structural modeling and limited experimentation to localize epitopes on an antigen of known, or inferred, 3D structure for one or more antibodies known to bind to said antigen.

The paper contains 5 main parts, most of them outlined in the Results section.

1) Description of EpiScope.

2) Prospective epitope localization for two antibodies against the target B7H6.

a) Analysis of individual binders with two independent sets of antigen mutant designs.

b) Simultaneous analysis of both antibodies.

3) Retrospective analysis of 33 antibody: antigen complexes from the PDB.

4) Epitope binning and epitope localization for 12 antibodies to the D8 envelope protein of vaccinia virus.

5) Comparison of EpiScope results to the PEASE method (Sela-Culang et al., 2014). NB. This portion is presented in the Discussion, but given its nature (new results and data), it would make more sense to present it in Results.

Overall, the manuscript describes what could be a potentially significant step towards solution of an important problem with wide practical and fundamental implications. This method fills a nice niche between fine epitope mapping by comprehensive mutagenesis and deep sequencing, and rough epitope binning experiments conducted by expression and binding characterization of individual antigen domains.

The revision calls for additional detail and disclosure to establish more clearly the scope and limitations of the methodology, when applicable attempt an understanding of why the method seems to work as well as it does, and last, but not least, make method details more transparent as to enable interested independent researchers to reproduce the key aspects of this work. The items below, collected from the three independent reviewers, aim to specify the requests explicitly.

Essential revisions:

1) On the method, which may be described briefly as follows: Ab models are generated using the PIGS webserver, and docking decoys are generated using the rigid body docking software program ClusPro. In a key step, Ag variants are computationally designed to maintain stability and disrupt binding modes of a subset of the clustered decoys. These variants are constructed and experimentally characterized, and – as needed – follow-up experiments are designed to obtain fine epitopes. The last, key, step is not described in enough detail to allow an understanding, let alone set the stage for independent implementation, of the method.

2) The link provided in the transparent reporting file is currently a placeholder (as an aside, the referred to work, Choi et al., 2017, was first published online in late 2016; thus, unclear why the link remains a placeholder at this time); one can hope this would provide the lacking detail in the future, but right now it makes it harder to evaluate the work.

3) Nonetheless, some basic detail on the actual cases (see below) could help fill the gaps and provide real insight into how the method works. The authors should also disclose limitations of the method (more on this below), particularly the impact of errors in antibody modeling, conformational change in antigen upon binding (a factor not taken into account in the rigid-body docking approach of ClusPro), errors in ClusPro and docking procedures in general. Readers of the paper would benefit by a frank discussion of how far one can push this methodology in terms of accuracy of models for antigen: for example, choose Ag-Ab sets where Ag comparative models give average TM scores <0.82 (as many models do), and discuss the limitations in pure modeling approaches in determining accuracy of their method in retrospective examination; presumably it may be hard to assess whether mutations will disrupt Ag stability, when using a model for the Ag, a point the authors do not address.

4) In more detail: for example, docking of each of the two antibodies (TZ47 and PB11) to the antigen B7H6 resulted in 28 docking models, and yet only 4 and 5 triple mutants were needed to determine the epitopes. It is understood that the epitope is determined if a set of mutations disrupts binding. However, does the result mean that 27 of the 28 potential models were eliminated as binding outside the epitope? Or do the 28 docked models define only 4 or 5 distinct interfaces? Based on Figure 2 the latter could be the case. However, according to Figure 3—figure supplement 1, the docked models together define a very large fraction of the antigen surface. How can the 4 or 5 mutant designs cover this entire surface? Are the 3 mutations within each triple mutant far from each other to provide the coverage? If this is the case, how well is the final epitope defined? What fraction of the surface is covered by the docking models, and what surface fraction is the final epitope? These questions should be answered to have a better understanding of how and why the method works. Also, it is not clear how the three mutations in the triple mutants are placed relative to each other. Spelling out the sequences of all designed mutants tested could go a long way towards clarifying these issues (see below).

5) Further methodology questions that need to be addressed. For docking: (a) How are the representative docking models selected (centroid, lowest energy)? (b) Does the number of docking clusters influence the final number of designs? (c) Average computational time required for variant design should be reported, and (d) ClusPro option should be "non-CDR masking." For variant design and clustering: Which implementation of the k-medoids algorithm has been adopted? For mutant design choices, how are "related sequence information" (for the antigen) and energy considerations balanced? In the 1H0D example around Figure 4—figure supplement 3; it is said that "since the loop has no mutational information in closely related protein sequences (panel C), mutations that could disrupt binding are not considered in the design process." This could be a real limitation whenever such "related sequences" information is lacking or restricted for a given antigen of interest; one would think the energy-based criteria to choose mutations could compensate in such cases. This issue deserves some discussion, if anything, as avenue for future research.

6) On the prospective test for two B7H6 antibodies, the PDB code for the antigen is given, but not the sequences of the antibodies. It's possible this information is available for TZ47, as it has been described in the literature previously; this does not seem to be the case at all for the second antibody, PB11. To support the data in Figures 2 and 3, it is essential to spell out the 4 (for TZ47) and 5 (for PB11) B7H6 mutant designs; this could easily be included in a supplemental Excel file, for example. Data for the 4 TZ47-relevant designs are presented in Figure 2A-B-C, but the designs themselves are not described in much detail. For PB11, data for only one design (of the 5 apparently made and tested) is shown in Figure 2D-E-F. In Figure 2E, binding of the NKp30 ligand (used to monitor intact folding) appears reduced by about 5-fold relative to the wild type B7H6, and this is not explained or commented on by the authors. By contrast, the designs made to probe binding of TZ47 appear to maintain close to 100% of the WT binding to the control ligand. A key potential benefit of the method for mutant design is the expectation that it will flag positions (and associated mutations) most likely to disrupt binding by antibody, while least likely to affect antigen folding, stability and expression, so as to enable a straightforward interpretation of a "loss of or reduced binding" readout. It is thus essential to address what may appear at first blush as a weakness in this aspect.

7) Similar disclosure is needed to support Figure 3: the exact sequences of the "integrated set of 6 (B7H6) designs" are essential part of the work. Having the data for all 6 (in the equivalent of Figure 3B) would also be informative and would allow a more transparent evaluation of the whole data set.

8) Last, in Figure 2—figure supplement 1B, binding of ligand to WT B7H6 appears bimodal; why is this, and how might it impact interpretation of data obtained with the mutants?

9) On the analysis of 33 antibody: antigen complexes, characterization of antibody models should be made more in line with how these are evaluated in the literature (see for example issue 8 of volume 83 of Proteins on antibody modeling assessment). Minimally, the rms. deviations in CDRH3s need to be reported. It would also make the work significantly more "reader-friendly" if details like H3 sequences, and lengths, organism of origin of antibody, and so on would be added to Figure 4—table supplement 1 or related table in the CSV-formatted supplementary material.

10) Upon cursory analysis, it appears that 20 of the 33 antibodies in the set are of mouse origin, 2 from rat and 11 are human (or humanized). As antibodies (and perhaps binders in general using alternative scaffolds) of other species or origins (more unique synthetic designs, for example) become more prevalent, their structures will be harder to model, before the PDB "catches up," and this is a point worth mentioning in the paper.

11) Given that the docking poses tend to cover a large portion of each of the antigen's surface, it is hard to envision how the method can narrow down to a handful of proposed mutants that in close to 90% of the test cases include a residue that is part of the Ab: Ag interface. Working out at least one example in detail, including publication, as supplementary data, of the docked Ab: Ag models as PDB files and so on, could help illustrate how this level of success may be achieved. The fact the method does not suffer significantly when using moderate quality models of the antigen as well suggests some low-resolution feature of molecular recognition, as encoded in the respective amino acid sequences of the interacting partners, is in play here. This aspect requires more discussion.

12) On the selection of the 33 cases: using the same criteria adopted by the authors (70% max sequence identity for the antigen; 70% maximum sequence identity for the antibody; max resolution 3.0; protein antigens) in the generation of the retrospective test sets (with the exception of the sequence length), one would have retrieved 76 complexes from the SAbDab database. This is a larger list and about 2/3 of the structures in the test set of 33 are absent from the new query. Is this a function of database growth since the work was done or do some other filtering criteria need to be specified?

13) On epitope binning and localization of 12 antibodies to the D8 antigen. This section applies EpiScope to a case study from Sela-Culang et al., 2014. At first blush, some of the results are striking, but under closer examination of the sequences of the 12 antibodies (which, to make the paper more reader-friendly, should be disclosed in supplementary material), it is apparent some of the "predictions" of same epitope are rather non-surprising (see also Jia et al. JIM 2004, 288: 91-98; PMID: 15183088, for some background on this notion). Should anyone be surprised, for example, that antibodies AB12.2 and CC7.1 compete with one another, given they have identical CDR H3s and very similar sequences for the other CDRs? Similarly, for FH4.1 and LA5. Some of this result is hinted at in Figure 5—figure supplement 2 (and associated data), but it demands some effort from the reader to sort out. One should also note that the H3s of EB2.1 and JE10 have identical lengths and differ in just one position, while the H3s of BH7.2 and JA11.2 differ by just 2 – other CDRs in those pairs present high similarity as well. Not surprisingly, again, each of these pairs map to the same epitope group. Lastly, the structure of LA5 in complex with the D8 antigen was solved (PDB code 4EBQ), which may be interesting to comment on. The reality here is that of the 12 mAbs, only a fraction may be considered truly independent, after very simple examination of CDR3 sequences (HC and LC), and it may make sense to redo the analysis using an independent subset. The statement commented on below (subsection “Computationally-driven epitope binning and localization of multiple Abs targeting the same Ag”, end of third paragraph) could be modified or illuminated with this consideration.

14) On the comparison of EpiScope and PEASE for the different test cases. First, to reiterate, we suggest including this section in Results instead of Discussion. Second, and this could be in the Discussion, the authors should elaborate on the potential signal present in the "lower resolution" PEASE approach and on the statement "[…] suggesting a potential benefit to incorporating PEASE predictions into the generation of hypotheses for EpiScope-directed experimental validation."

https://doi.org/10.7554/eLife.29023.045

Author response

Summary:

The manuscript describes a procedure, termed EpiScope, that combines structural modeling and limited experimentation to localize epitopes on an antigen of known, or inferred, 3D structure for one or more antibodies known to bind to said antigen.

The paper contains 5 main parts, most of them outlined in the Results section.

1) Description of EpiScope.

2) Prospective epitope localization for two antibodies against the target B7H6.

a) Analysis of individual binders with two independent sets of antigen mutant designs.

b) Simultaneous analysis of both antibodies.

3) Retrospective analysis of 33 antibody: antigen complexes from the PDB.

4) Epitope binning and epitope localization for 12 antibodies to the D8 envelope protein of vaccinia virus.

5) Comparison of EpiScope results to the PEASE method (Sela-Culang et al., 2014). NB. This portion is presented in the Discussion, but given its nature (new results and data), it would make more sense to present it in Results.

As suggested, we have moved the comparison to the Results section.

Overall, the manuscript describes what could be a potentially significant step towards solution of an important problem with wide practical and fundamental implications. This method fills a nice niche between fine epitope mapping by comprehensive mutagenesis and deep sequencing, and rough epitope binning experiments conducted by expression and binding characterization of individual antigen domains.

We thank the reviewers for the nice characterization of where and how we are contributing to the study of antibody:antigen recognition. We have incorporated into the Discussion the phrase describing the niche filled.

The revision calls for additional detail and disclosure to establish more clearly the scope and limitations of the methodology, when applicable attempt an understanding of why the method seems to work as well as it does, and last, but not least, make method details more transparent as to enable interested independent researchers to reproduce the key aspects of this work. The items below, collected from the three independent reviewers, aim to specify the requests explicitly.

We agree that it is very interesting that, in general, computational modeling and protein design seem to work as well as they do in leveraging key determinants of recognition in order to efficiently localize epitopes. We augmented our analysis of both the prospective and retrospective tests, as detailed below, in order to provide additional insights into when and why this is possible. We likewise certainly want to make the method transparent and widely useable, so as to enable a wide range of other such investigations, and have addressed the suggestions for how to do so.

Essential revisions:

1) On the method, which may be described briefly as follows: Ab models are generated using the PIGS webserver, and docking decoys are generated using the rigid body docking software program ClusPro. In a key step, Ag variants are computationally designed to maintain stability and disrupt binding modes of a subset of the clustered decoys. These variants are constructed and experimentally characterized, and – as needed – follow-up experiments are designed to obtain fine epitopes. The last, key, step is not described in enough detail to allow an understanding, let alone set the stage for independent implementation, of the method.

This is a nice overall description of the method. We agree that the design of follow-up experiments is not described in any detail; that is because this step is in fact not the focus of the work here, and would not contribute to testing the overarching premise that computational modeling and design can leverage information about Ab-Ag binding encoded in sequence in order to drive experiments that efficiently identify epitope residues. Thus, while additional, finer-grain information could be obtained by follow-up experiments, we focus here on the initial, coarse-grained characterization as a novel contribution to the study of Ab-Ag recognition, as was nicely described by the reviewers in the summary. We have augmented the Discussion (“resolution” subsection) with some further thoughts about how future investigations may pursue this topic.

2) The link provided in the transparent reporting file is currently a placeholder (as an aside, the referred to work, Choi et al., 2017, was first published online in late 2016; thus, unclear why the link remains a placeholder at this time); one can hope this would provide the lacking detail in the future, but right now it makes it harder to evaluate the work.

We have been responding to requests for that software by email rather than automatically by web form, since the demand for deimmunized proteins, the main use case for the platform, is not (yet) sufficiently high to be bothersome. We have updated the page accordingly.

3) Nonetheless, some basic detail on the actual cases (see below) could help fill the gaps and provide real insight into how the method works. The authors should also disclose limitations of the method (more on this below), particularly the impact of errors in antibody modeling, conformational change in antigen upon binding (a factor not taken into account in the rigid-body docking approach of ClusPro), errors in ClusPro and docking procedures in general. Readers of the paper would benefit by a frank discussion of how far one can push this methodology in terms of accuracy of models for antigen: for example, choose Ag-Ab sets where Ag comparative models give average TM scores <0.82 (as many models do), and discuss the limitations in pure modeling approaches in determining accuracy of their method in retrospective examination; presumably it may be hard to assess whether mutations will disrupt Ag stability, when using a model for the Ag, a point the authors do not address.

The retrospective tests illustrated some of these limitations, and we have further elaborated the discussion of our observations there, as well as more generally in a new “limitations” subsection in the Discussion.

a) We agree with the reviewers’ concern that errors in Ab modeling could detrimentally affect the results. We were not able to see that with these retrospective tests since, despite using turn-key methods, the models were all quite good (we have added quantification to the text). Consequently there were no discernible trends. In fact, one of the most accurately modeled Ab structures, 1FE8 (CDR-H3 RMSD: 0.63 Å), failed due to lower quality docking (fnat: 0.1, just above ‘acceptable’ docking quality), while one of the worst models (2XQB: CDR-H3 RMSD 7.21 Å) had higher quality docking and succeeded (fnat: 0.32, above ‘medium’ quality docking). We have elaborated this in the Discussion (“Limitations | Ab homology models”), and with the new Figure 6—figure supplement 1.

b) Likewise, we agree that if there is major conformational change in binding, the docking and thus epitope localization could be led astray. And more generally, an incorrect Ag model (cf. point d) could negatively impact how much signal can be extracted. It depends overall on how well the actual binding interfaces are accurately modeled (among other factors). We have added suitable discussion (“Limitations | Ag homology models”).

c) While successes and limitations of docking are well studied in the field in general, we have elucidated this point in the context of this study. In summary, we observe that docking quality is affected by model quality (crystal structures are better), and the epitope localization failure cases all had poor docks (as assessed by fnat). We have added Figure 6 and discussion (“Limitations | Ab:Ag docking models”).

d) The degree of freedom under our control here, and indeed observable in prospective situations, is the sequence identity between the query and the template. I.e., rather than attempting to somehow force the average TM-score to be worse, we more realistically chose templates of moderate sequence identity. When dealing with a model in practice, one wouldn’t know TM-score and would have to proceed (or not) on the basis of sequence identity. Thus, we feel that this test set gives a reasonable characterization of what one could expect in practice, and we have further justified this approach in the methods. We found that in this study, there was no significant observable impact on results from Ag modeling quality. As one specific example, epitopes of one of the most poorly modeled targets (4JZJ) were still identifiable. We have added discussion (“Limitations | Ag homology models”) and included Table 5 with the sequence identities to further help elucidate this point.

e) The reviewers are correct that there is not sufficient data for these test cases to assess impacts of mutations on antigen stability. And indeed that is not our contribution here – we use one of many well-established computational techniques for predicting and designing for stability; this one in particular has been successfully applied to other challenging cases such as mutation of hydrophobic core residues in our previous publications. Note furthermore that our objective here is not to improve stability, but to ensure that the target is not significantly destabilized while making putative binding-disruptive mutations. Perhaps mutagenesis for epitope identification is one of the best-case scenarios for such cases: the mutations are fairly far apart, at solvent-exposed positions, and use amino acids that appear in the sequence record. Thus the predicted effects are good, and even intuitively we would expect them to be benign (cf. alanine scanning). We have added related discussion (“Limitations | Ag mutations”). Finally, we note that for the prospective tests, the experimental results show that the designed mutations did not destabilize the antigen to the point of eliminating either expression or binding of its natural ligand.
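To make the docking-quality measure cited in points (a) and (c) concrete, the following minimal sketch computes fnat, the fraction of native interface contacts reproduced by a docking model, and bands it using the commonly quoted CAPRI cut-offs (0.1, 0.3, 0.5). The contact representation (pairs of residue identifiers) and the function names are assumptions for illustration; full CAPRI classification also uses ligand and interface RMSD, which are omitted here.

```python
def fnat(native_contacts, model_contacts):
    """Fraction of native residue-residue contacts also present in the model."""
    native = set(native_contacts)
    if not native:
        raise ValueError("native contact set is empty")
    return len(native & set(model_contacts)) / len(native)


def fnat_band(value):
    """Band an fnat value by the usual CAPRI fnat cut-offs only (RMSD terms ignored)."""
    if value >= 0.5:
        return "high"
    if value >= 0.3:
        return "medium"
    if value >= 0.1:
        return "acceptable"
    return "incorrect"


# Hypothetical contacts: (antibody residue, antigen residue) identifier pairs.
native = [("H100", "A%d" % i) for i in range(10)]
model = [("H100", "A0"), ("L50", "A99")]  # reproduces 1 of 10 native contacts
print(fnat(native, model), fnat_band(fnat(native, model)))
```

Under this banding, the two values quoted in point (a), 0.1 and 0.32, fall at the lower edges of the "acceptable" and "medium" bands respectively.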

4) In more detail: for example, docking of each of the two antibodies (TZ47 and PB11) to the antigen B7H6 resulted in 28 docking models, and yet only 4 and 5 triple mutants were needed to determine the epitopes. It is understood that the epitope is determined if a set of mutations disrupts binding. However, does the result mean that 27 of the 28 potential models were eliminated as binding outside the epitope? Or do the 28 docked models define only 4 or 5 distinct interfaces? Based on Figure 2 the latter could be the case. However, according to Figure 3—figure supplement 1, the docked models together define a very large fraction of the antigen surface. How can the 4 or 5 mutant designs cover this entire surface? Are the 3 mutations within each triple mutant far from each other to provide the coverage? If this is the case, how well is the final epitope defined? What fraction of the surface is covered by the docking models, and what surface fraction is the final epitope? These questions should be answered to have a better understanding of how and why the method works. Also, it is not clear how the three mutations in the triple mutants are placed relative to each other. Spelling out the sequences of all designed mutants tested could go a long way towards clarifying these issues (see below).

These are a good set of related questions, and we have updated the text to make the method and the interpretation of the results clearer in that regard. We note that the main goal is to identify the general area of the Ag that is interacting with the Ab (in that mutations disrupt binding) – the docking models are a means to the end, not the end. So we do not try to eliminate all models, and in fact, there are multiple models consistent with the disruptive mutations. This is true even with a single identified mutation, which in our retrospective set (Figure 4—figure supplement 4; see below) would affect on average ~12 docking models (with footprints covering ~47% of the surface).

As the reviewer suggests, the mutations are not immediately adjacent to each other; in the main results as reported in the initial submission, they were limited to be on average (across the set of mutations in a design) 12Å apart or less. However, this distance varied in the different cases according to the selection algorithm. We elaborated the text and associated tables (Table 1 and 2) and files (Supplementary file 2 – sequences, and Supplementary file 3 – pymol session) for the prospective application to characterize the patterns of mutations, docking models, and footprints.

We further leveraged the retrospective tests to systematically explore impacts of distances and numbers of mutations on these characteristics, using just a single design (set of mutations) to enable direct assessment of the impact of the parameters, adding Figure 4—figure supplement 4 and related text (“Generalizability | inter-mutation distances”). First (panel (A)), we examined the actual distribution of inter-mutation distances in designs (under the constraint on their maximum), and found it to peak at 11~15Å for both two and three mutation designs. We then studied (panels B-D) the impact of varying the distance threshold, both on the success of hitting an epitope and on the “resolution” in terms of number of consistent docking models and number of residues in their footprints.

The success rate of the single mutation designs was approximately 25%. Using two or three mutations helps, achieving a 50-55% success rate for a single double or triple mutant. Spreading mutations further out than our initial 12 Å limit didn’t seem to help, and in fact could hurt (perhaps due to lack of localization). Bringing them too close together likewise decreased the success rate, though we note that there are few designs available at shorter cut-offs (particularly 6 Å). Lastly, the distance cut-off affects resolution in the way that might be expected – a larger cut-off yielded a larger “footprint”, both in terms of consistent docking models and in terms of their footprints on Ag surfaces. The trends were fairly linear. As noted above, even a single mutation was consistent with about 12 docking models hitting about 47% of the surface. The double and triple mutants hit a few more docking models and thus a bit more of the surface, but didn’t sacrifice too much above that baseline for the large increase in success rate mentioned above. While the cut-off does result in a trade-off between the resolution and the success rate, the threshold we used in the initial results (12Å) gave one of the best success rates at all mutational loads we have tested.
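The distance constraint discussed above can be sketched as follows: a design is accepted when the average pairwise distance between its mutated positions is at most the cut-off (12 Å in the main results). This is a minimal illustration, not the EpiScope code; in particular, representing each mutated position by a single 3D coordinate (for example its Cα atom) is an assumption, as are the function names.

```python
import math
from itertools import combinations


def mean_pairwise_distance(coords):
    """Average distance over all pairs of mutated positions (0.0 for a single mutation)."""
    pairs = list(combinations(coords, 2))
    if not pairs:
        return 0.0
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def within_cutoff(coords, cutoff=12.0):
    """True if the design's mutations are, on average, no further apart than the cut-off."""
    return mean_pairwise_distance(coords) <= cutoff


# Hypothetical coordinates (in Å) for a triple mutant and a widely spread double mutant.
triple = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 10.0, 0.0)]
double = [(0.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(within_cutoff(triple), within_cutoff(double))
```

Lowering `cutoff` localizes the mutations more tightly, at the cost of leaving fewer candidate designs, which mirrors the trade-off between resolution and success rate described above.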

5) Further methodology questions that need to be addressed. For docking: (a) How are the representative docking models selected (centroid, lowest energy)? (b) Does the number of docking clusters influence the final number of designs? (c) Average computational time required for variant design should be reported, and (d) ClusPro option should be "non-CDR masking." For variant design and clustering: Which implementation of the k-medoids algorithm has been adopted? For mutant design choices, how are "related sequence information" (for the antigen) and energy considerations balanced? In the 1H0D example around Figure 4—figure supplement 3; it is said that "since the loop has no mutational information in closely related protein sequences (panel C), mutations that could disrupt binding are not considered in the design process." This could be a real limitation whenever such "related sequences" information is lacking or restricted for a given antigen of interest; one would think the energy-based criteria to choose mutations could compensate in such cases. This issue deserves some discussion, if anything, as avenue for future research.

We have further elaborated these details. The reviewers’ text included letters (a) and (b); we continued the pattern in order to match question and response.

a) Selection of representative docking models was handled by the ClusPro webserver for these results. By default, ClusPro generates ~70,000 models which are then clustered into ~30 representatives (in antibody mode). We used all the docking models provided by ClusPro, rather than selecting just the centroid or just the lowest energy one, and treated them all as equally plausible. We have clarified this matter in the text (“Methods | EpiScope”).

b) While the number of docking models of many targets was close to 30, there was enough variation to allow evaluation of the correlation between the number of docking models and the number of final selected designs. For crystal structures, the correlation coefficient (Kendall’s τ) was 0.397, and 0.443 for model structures (both p-values < 0.01). We have added to the Discussion (“Limitations | Ab: Ag docking models”) specifics from this experience along with more general thoughts about the relationship between the number of models and computational effort.

c) While the timing depends on many factors related to other discussion points (number of docking models, number of allowed mutations, etc.), and the parts specific to EpiScope are simply written as Python scripts and not tuned for speed, we note that the test targets all required only about a day (using 10 nodes on a cluster) for the key steps of generating and clustering the designs. We have added this to the Discussion (“Limitations | computational cost”) as part of the general characterization of trade-offs and limitations.

d) Right, we have fixed the terminology regarding our use of the “non-CDR masking” option.

e) The K-medoids implementation uses a single file from the pyclust package, which is based on the partitioning around medoids algorithm. We modified the script to deal with the Hausdorff distance. We have clarified this point in the text.

f) As is often done to limit the degrees of freedom in protein design and focus on those most likely to work (as nature has already evaluated them), the mutations considered are filtered a priori to be those appearing among homologs. There is subsequently no explicit balance between sequence statistics and energies, as it is assumed that accepted mutations are all okay if they are energetically favorable. We have elaborated this point in the text (“Limitations | Ag mutations”).

g) We agree that the selection of allowed mutations from homologs could be relaxed or even dropped in practice, e.g., by evaluating individual energies as suggested. The example the reviewer points to illustrates the limitation of adopting the filter, and we have added a brief discussion as suggested (“Limitations | Ag mutations”).
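As an illustration of point (e), the sketch below clusters designs by k-medoids under the Hausdorff distance. It is not the modified pyclust script: it is a simplified, greedy swap-based variant in the spirit of partitioning around medoids, and it assumes (hypothetically) that each design is represented as a set of 3D coordinates of its mutated positions.

```python
import math


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty sets of 3D points."""
    def directed(p, q):
        return max(min(math.dist(x, y) for y in q) for x in p)
    return max(directed(a, b), directed(b, a))


def total_cost(d, medoids):
    """Sum over all items of the distance to the nearest medoid."""
    return sum(min(row[m] for m in medoids) for row in d)


def k_medoids(items, k, dist=hausdorff):
    """Greedy swap-based k-medoids; returns (medoid indices, cluster labels)."""
    n = len(items)
    d = [[dist(items[i], items[j]) for j in range(n)] for i in range(n)]
    medoids = list(range(k))  # deterministic start: the first k items
    cost = total_cost(d, medoids)
    improved = True
    while improved:
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                trial_cost = total_cost(d, trial)
                if trial_cost < cost - 1e-12:  # accept only strict improvements
                    medoids, cost, improved = trial, trial_cost, True
    labels = [min(range(k), key=lambda c: d[i][medoids[c]]) for i in range(n)]
    return medoids, labels


# Hypothetical designs: two near the origin, two about 20 Å away.
designs = [
    [(0, 0, 0), (1, 0, 0)],
    [(0, 1, 0), (1, 1, 0)],
    [(20, 0, 0), (21, 0, 0)],
    [(20, 1, 0), (21, 1, 0)],
]
medoids, labels = k_medoids(designs, k=2)
print(medoids, labels)
```

Because medoids are always actual members of the set, the method needs only a pairwise distance, which is why a set-to-set measure such as the Hausdorff distance can be used directly.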

6) On the prospective test for two B7H6 antibodies, the PDB code for the antigen is given, but not the sequences of the antibodies. It's possible this information is available for TZ47, as it has been described in the literature previously; this does not seem to be the case at all for the second antibody, PB11. To support the data in Figures 2 and 3, it is essential to spell out the 4 (for TZ47) and 5 (for PB11) B7H6 mutant designs; this could easily be included in a supplemental Excel file, for example. Data for the 4 TZ47-relevant designs are presented in Figure 2A-B-C, but not what they are in much detail. For PB11, data for only one design (of the 5 apparently made and tested) is shown in Figure 2D-E-F. In Figure 2E, binding of the Nkp30 ligand (used to monitor intact folding) appears reduced by about 5-fold relative to the wild type B7H6, and this is not explained or commented on by the authors. By contrast, the designs made to probe binding of TZ47 appear to maintain close to 100% of the WT binding to the control ligand. A key potential benefit of the method for mutant design is the expectation that it will flag positions (and associated mutations) most likely to disrupt binding by antibody, while least likely to affect antigen folding, stability and expression, as to enable a straightforward interpretation of a "loss of or reduced binding" readout. It is thus essential to address what may appear as first blush as a weakness in this aspect.

a) Uploaded in Supplementary file 3.

b) Mutations for each design were added as Table 1. We have also included the mutations for PB-specific designs in this table. Multi-Ab designs are included in Table 2.

c) One EpiScope design was tested for PB11 – by the time we had added PB11 as a test case for EpiScope, we had actually tested PB11 binding to all TZ47-specific designs and Macaque-Human B7H6 Chimeras and had roughly localized its epitope to the N-terminal region of Domain I. Thus based on the four PB-specific designs generated by EpiScope, we cloned and tested the one design with mutations in that region.

d) The reviewers are correct in pointing out the impact on NKp30 binding of the disruptive PB11-Ag2 design as compared to the TZ47 designs, which may relate to the positions of the mutations within each design relative to the binding footprint of NKp30. The NKp30 binding region is very close to the mutations contained in the PB11-Ag2 design at the N-terminal region of Domain I (Figure 2D). Thus, although the mutations contained in the PB11-Ag2 design do not fall within the NKp30 binding footprint, it is possible that they had a regional effect in decreasing NKp30 binding. In contrast, the TZ47-specific designs were further away from the NKp30 binding region (Figure 2A). We recognize the confusion this causes in using NKp30 binding as a marker of stability/expression, and have clarified this in the manuscript.

7) Similar disclosure is needed to support Figure 3: the exact sequences of the "integrated set of 6 (B7H6) designs" are essential part of the work. Having the data for all 6 (in the equivalent of Figure 3B) would also be informative and would allow a more transparent evaluation of the whole data set.

The sequences for the integrated designs are uploaded in Supplementary file 3. The mutations are summarized in Table 2. Once again, only designs that contained mutations in epitope localized regions (based on binding data to other EpiScope designs and macaque-human chimeras) were created and tested.

8) Last, in Figure 2—figure supplement 1B, binding of ligand to WT B7H6 appears bimodal; why is this and how it may impact interpretation of data obtained with the mutants?

We are unsure as to the origin of this bimodal behavior, though it has been reported/observed previously. If we were to speculate, the RMA-B7H6 cells used included both adherent and suspension cells, which may differ in their clustering of surface proteins including B7H6. Because NKp30 is more reliant upon avidity than TZ47 and PB11, it is possible that the bimodal distribution results from either the adherent or suspension population having greater clusters of B7H6 on their surface, enabling greater NKp30-binding.

9) On the analysis of 33 antibody: antigen complexes, characterization of antibody models should be made more in line with how these are evaluated in the literature (see for example issue 8 of volume 83 of Proteins on antibody modeling assessment). Minimally, the rms. deviations in CDRH3s need to be reported. It would also make the work significantly more "reader-friendly" if details like H3 sequences, and lengths, organism of origin of antibody, and so on would be added to Figure 4—table supplement 1 or related table in the CSV-formatted supplementary material.

These model details are now provided in Table 4; implementation details are now in the Materials and methods. Note that the IMGT definition was used for CDRs, and RMSDs were evaluated using all backbone atoms (N, C, Cα and O). Accuracy ranges were similar to those reported in [Choi and Deane, Molecular Biosystems 7.12 (2011): 3327-3334] -- sub-Å for non CDR-H3 loops and ~2Å for CDR-H3.
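The backbone RMSD described above can be sketched as follows, restricted to the four backbone atoms (N, Cα, C and O). This is a minimal illustration under stated assumptions: the two structures are taken to be already superimposed, atoms are stored in a hypothetical dictionary keyed by (residue identifier, atom name), and the residues passed in are those of the loop of interest.

```python
import math

BACKBONE_ATOMS = ("N", "CA", "C", "O")


def backbone_rmsd(model, reference):
    """RMSD over backbone atoms present in both structures (no superposition is done)."""
    keys = [k for k in reference if k[1] in BACKBONE_ATOMS and k in model]
    if not keys:
        raise ValueError("no shared backbone atoms")
    squared = sum(math.dist(model[k], reference[k]) ** 2 for k in keys)
    return math.sqrt(squared / len(keys))


# Hypothetical one-residue example: the model is shifted by 1 Å along x.
reference = {
    ("H95", "N"): (0.0, 0.0, 0.0),
    ("H95", "CA"): (1.5, 0.0, 0.0),
    ("H95", "C"): (2.0, 1.4, 0.0),
    ("H95", "O"): (1.3, 2.4, 0.0),
    ("H95", "CB"): (2.2, -1.2, 0.0),  # side-chain atom, ignored
}
model = {k: (x + 1.0, y, z) for k, (x, y, z) in reference.items()}
model[("H95", "CB")] = (50.0, 50.0, 50.0)  # a large side-chain error does not count
print(backbone_rmsd(model, reference))
```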

10) Upon cursory analysis, it appears that 20 of the 33 antibodies in the set are of mouse origin, 2 from rat and 11 are human (or humanized). As antibodies (and perhaps binders in general using alternative scaffolds) of other species or origins (more unique synthetic designs, for example) become more prevalent, their structures will be harder to model, before the PDB "catches up," and this is a point worth mentioning in the paper.

We have now incorporated further discussion of the general state of antibody modeling (“Limitations | Ab homology models”), including extension to those from other origins and alternative antibody-like protein binders that may be harder (or indeed may be easier).

11) Given that the docking poses tend to cover a large portion of each of the antigen's surface, it is hard to envision how the method can narrow down to a handful of proposed mutants that in close to 90% of the test cases include a residue that is part of the Ab: Ag interface. Working out at least one example in detail, including publication, as supplementary data, of the docked Ab: Ag models as PDB files and so on, could help illustrate how this level of success may be achieved. The fact the method does not suffer significantly when using moderate quality models of the antigen as well suggests some low-resolution feature of molecular recognition, as encoded in the respective amino acid sequences of the interacting partners, is in play here. This aspect requires more discussion.

We discussed the overarching question of resolution in the response to point 4, above. We completely agree that the study here reveals that modeling and design can indeed extract and leverage “low-resolution features of molecular recognition, as encoded in the respective amino acid sequences of the interacting partners” (another nice phrase we thank the reviewer for and have incorporated into the Discussion). To illustrate in detail, we have provided in Supplementary file 1 a pymol session file that includes all 13 docking models and associated designs for the example in Figure 1. We also have included Supplementary file 2 for all docking models and designs for the B7H6 prospective test.

12) On the selection of the 33 cases: using the same criteria adopted by the authors (70% max sequence identity for the antigen; 70% maximum sequence identity for the antibody; max resolution 3.0; protein antigens) in the generation of the retrospective test sets (with the exception the sequence length), one would have retrieved 76 complexes from the SabDab database. This is a larger list and about 2/3 of the structures in the test set of 33 are absent from the new query. Is this a function of database growth since the work was done or do some other filtering criteria need to be specified?

We initially had 70 complexes, so indeed the database seems to have grown by 6. There were a few additional quality filtering criteria that were accidentally dropped from the text – we require a single Ag chain with no missing backbone atoms, so that homology modeling would not impact the results. Furthermore, we limited Ag length to be between 50 and 300 residues. We have specified these criteria in the revised manuscript (“Methods | Retrospective test sets”).
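The per-complex quality criteria listed above can be expressed as a simple filter. The sketch below is illustrative only: the record fields are hypothetical, the inclusive length bounds are an assumption, and the 70% sequence-identity redundancy filter is omitted because it requires pairwise comparison across complexes rather than a per-record check.

```python
from dataclasses import dataclass


@dataclass
class Complex:
    """Hypothetical summary record for one Ab:Ag complex."""
    name: str
    resolution: float            # in Å
    protein_antigen: bool
    antigen_chains: int
    antigen_length: int          # in residues
    missing_backbone_atoms: bool


def passes_filters(c, max_resolution=3.0, min_len=50, max_len=300):
    """Apply the per-complex criteria: protein Ag, resolution, single complete chain, length."""
    return (
        c.protein_antigen
        and c.resolution <= max_resolution
        and c.antigen_chains == 1
        and not c.missing_backbone_atoms
        and min_len <= c.antigen_length <= max_len
    )


# Made-up records, for illustration only.
candidates = [
    Complex("ok", 2.1, True, 1, 120, False),
    Complex("low_res", 3.4, True, 1, 120, False),
    Complex("two_chains", 2.0, True, 2, 120, False),
    Complex("too_long", 2.0, True, 1, 450, False),
    Complex("gaps", 2.0, True, 1, 120, True),
]
kept = [c.name for c in candidates if passes_filters(c)]
print(kept)
```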

13) On epitope binning and localization of 12 antibodies to the D8 antigen. This section applies EpiScope to a case study from Sela-Culang et al., 2014. At first blush, some of the results are striking, but under closer examination of the sequences of the 12 antibodies (which, to make the paper more reader-friendly, should be disclosed in supplementary material), it is apparent some of the "predictions" of same epitope are rather non-surprising (see also Jia et al. JIM 2004, 288: 91-98; PMID: 15183088, for some background on this notion). Should anyone be surprised, for example, that antibodies AB12.2 and CC7.1 compete with one another, given they have identical CDR H3s and very similar sequences for the other CDRs? Similarly, for FH4.1 and LA5. Some of this result is hinted at in Figure 5—figure supplement 2 (and associated data), but it demands some effort from the reader to sort out. One should also note that the H3s of EB2.1 and JE10 have identical lengths and differ in just one position, while the H3s of BH7.2 and JA11.2 differ by just 2 – other CDRs in those pairs present high similarity as well. Not surprisingly, again, each of these pairs map to the same epitope group. Lastly, the structure of LA5 in complex with the D8 antigen was solved (PDB code 4EBQ), which may be interesting to comment on. The reality here is that of the 12 mAbs, only a fraction may be considered truly independent, after very simple examination of CDR3 sequences (HC and LC), and it may make sense to redo the analysis using an independent subset. The statement commented on below (subsection “Computationally-driven epitope binning and localization of multiple Abs targeting the same Ag”, end of third paragraph) could be modified or illuminated with this consideration.

This dataset is the one Sela-Culang and colleagues used in their demonstration of the new paradigm of antibody-specific epitope prediction, and in fact they integrated the binning data into epitope prediction by combining predicted antigenic “patches” for the antibodies within a bin, thereby exploiting this seemingly redundant information from highly similar antibody sequences to improve predictions. So we followed their lead in looking at the relationships both within and between bins, and were pleased that the EpiScope “binning” results already largely reflected the observed ones, and might be complementary in additional ways. It is true that CDR sequences drive CDR structures, which drive Ab-Ag binding, and thus binding similarity can often be largely explained by CDR sequence similarity. However, it is of course not always obvious how strong an inference about binding similarity can be made from sequence similarity or even overall structural similarity. For example, while it appears that here heavy chain CDRs largely drive the binding (by comparison to binning patterns, as suggested by the reviewer), there are certainly cases where light chain ones do [Ko et al., PloS one 10.7 (2015): e0134600]. And likewise, there are different degrees of similarity among the different CDR-Hs. As a result, it is not clear how to combine them to infer impacts, to identify which CDRs matter and why, or to define the impacts of mutations on binding. These are all explicitly modeled and dealt with in a consistent fashion in our approach. We have added results and discussion to that effect (“[…] multiple Abs targeting the same Ag”).

Following the suggestion for more independence among representative antibodies, we reanalyzed the data using a subset of 7 of the 12 antibodies with no identical CDR sequences, and replaced Figure 5—figure supplement 3 with the new analysis both with EpiScope and with sequence analysis alone. (We note that Figure 5 was also updated to have the correct labels on the x axis of the matrix, which were in reverse order before.) While CDR-H sequence similarities still largely explain the binning similarity, EpiScope does appear to pull out additional information. EpiScope integrates structural modeling of the Ab, along with Ab-Ag docking and Ag design to disrupt putative binding, thereby systematically integrating all the relevant information and proposing hypotheses for experimental evaluation. We have further elaborated this analysis in the text (“[…] multiple Abs targeting the same Ag”).

We have added Figure 5—figure supplement 2 to the manuscript as the reviewer suggested. Please note that 4EBQ is not a complex structure but 4ETQ is. There are five designs generated for D8-LA5 using EpiScope and two of them are in the Ab-Ag binding interface.

14) On the comparison of EpiScope and PEASE for the different test cases. First, to reiterate, we suggest including this section in Results instead of Discussion. Second, and this could be in the Discussion, the authors should elaborate on the potential signal present in the "lower resolution" PEASE approach and on the statement "[…] suggesting a potential benefit to incorporating PEASE predictions into the generation of hypotheses for EpiScope-directed experimental validation."

We have moved the comparison to the Results.

We have also briefly discussed (end of that section) how PEASE predictions may complement EpiScope, by further focusing docking and design efforts on those regions likely to be fruitful, either as determined by purely in silico modeling, or by integrating additional prior data.

https://doi.org/10.7554/eLife.29023.046

Article and author information

Author details

  1. Casey K Hua

    1. Thayer School of Engineering, Dartmouth College, Hanover, United States
    2. Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, United States
    Contribution
    Conceptualization, Data curation, Funding acquisition, Investigation, Methodology, Writing—original draft
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4420-7228
  2. Albert T Gacerez

    Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, United States
    Contribution
    Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  3. Charles L Sentman

    Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Writing—review and editing
    Competing interests
    No competing interests declared
  4. Margaret E Ackerman

    1. Thayer School of Engineering, Dartmouth College, Hanover, United States
    2. Department of Microbiology and Immunology, Geisel School of Medicine, Dartmouth College, Lebanon, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing—original draft
    Competing interests
    No competing interests declared
    ORCID: 0000-0002-4253-3476
  5. Yoonjoo Choi

    Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
    Contribution
    Conceptualization, Data curation, Software, Funding acquisition, Investigation, Methodology, Writing—original draft
    For correspondence
    yoonjoo.choi@kaist.ac.kr
    Competing interests
    No competing interests declared
    ORCID: 0000-0001-9687-8093
  6. Chris Bailey-Kellogg

    Department of Computer Science, Dartmouth College, Hanover, United States
    Contribution
    Conceptualization, Supervision, Funding acquisition, Investigation, Methodology, Writing—original draft
    For correspondence
    cbk@cs.dartmouth.edu
    Competing interests
    Dartmouth faculty and a co-member of Stealth Biologics, LLC, a Delaware biotechnology company. This author acknowledges that there is a potential financial conflict of interest related to his associations with this company, and he hereby affirms that the data presented in this paper is free of any bias. This work has been reviewed and approved as specified in Chris Bailey-Kellogg's Dartmouth conflict of interest management plans.
    ORCID: 0000-0003-1860-0912

Funding

National Institutes of Health (R01 GM098977)

  • Chris Bailey-Kellogg

National Research Foundation of Korea (2016H1D3A1938246)

  • Yoonjoo Choi

National Science Foundation (CNS-1205521)

  • Chris Bailey-Kellogg

National Institutes of Health (5F30 AI122970-02)

  • Casey K Hua

National Institutes of Health (1R01AI102691)

  • Margaret E Ackerman

Center of Biomedical Research Excellence (8P30GM103415)

  • Charles L Sentman
  • Margaret E Ackerman

Allan U. Munck Education and Research Fund at Dartmouth

  • Charles L Sentman
  • Margaret E Ackerman

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. Max Vasquez, Adimab Inc., United States

Publication history

  1. Received: May 26, 2017
  2. Accepted: December 2, 2017
  3. Accepted Manuscript published: December 4, 2017 (version 1)
  4. Version of Record published: December 21, 2017 (version 2)

Copyright

© 2017, Hua et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,732
    Page views
  • 346
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, Scopus, PubMed Central.
