Abstract
Accurate prediction of the structurally diverse complementarity determining region heavy chain 3 (CDR-H3) loop structure remains a primary and long-standing challenge for antibody modeling. Here, we present the H3-OPT toolkit for predicting the 3D structures of monoclonal antibodies and nanobodies. H3-OPT combines the strengths of AlphaFold2 with a pre-trained protein language model, and provides a 2.24 Å average RMSDCα between predicted and experimentally determined CDR-H3 loops, thus outperforming other current computational methods in our non-redundant high-quality dataset. The model was validated by experimentally solving three structures of anti-VEGF nanobodies predicted by H3-OPT. We examined the potential applications of H3-OPT through analyzing antibody surface properties and antibody-antigen interactions. This structural prediction tool can be used to optimize antibody-antigen binding, and to engineer therapeutic antibodies with biophysical properties for specialized drug administration route.
eLife assessment
This paper presents H3-OPT, a deep learning method that effectively combines existing techniques for the prediction of antibody structure. This work, supported by convincing experiments for validation, is important because the method can aid in the design of antibodies, which are key tools in many research and industrial applications.
Significance of findings
important: Findings that have theoretical or practical implications beyond a single subfield
- landmark
- fundamental
- important
- valuable
- useful
Strength of evidence
convincing: Appropriate and validated methodology in line with current state-of-the-art
- exceptional
- compelling
- convincing
- solid
- incomplete
- inadequate
During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate). Learn more about eLife assessments
Introduction
Antibodies protect the host by recognizing and neutralizing foreign microbes or viruses and provide immunity against future infections. Furthermore, the use of therapeutic antibodies is steadily increasing in clinical treatments for infectious diseases, cancers, and autoimmune disorders1,2. For example, anti-PD-L1 antibodies block PD-L1/PD-1 interactions to inhibit tumor growth in patients3. A typical monoclonal antibody (mAb) is composed of heavy and light chains, while a nanobody (Nb) has only a single-domain variable heavy chain. The CDR heavy chain 3 (CDR-H3) loop is the most variable region in both length and amino acid sequence, and it plays a central role in antigen binding for both mAbs and Nbs. The development of effective therapeutic antibodies requires the solved structures of candidate antibodies, including the CDR-H3. However, it is both costly and labor intensive to experimentally obtain structures, and thus computationally predicted structures are often used to guide antibody design4. Despite major advances in computational methods, structure prediction for CDR-H3 loops remains challenging5.
Traditional template-based modeling methods, such as RosettaAntibody6, PIGS7, ABodyBuilder8, and MODELLER9, often provide demonstrably inaccurate predictions for CDR-H3 sequences10–12, and as a result, alternative artificial intelligence (AI)-based approaches, such as AlphaFold2 (AF2)13, trRosetta14 and RoseTTAFold15, are increasing in popularity. AF2 has shown comparable accuracy to experimentally determined structures by capturing physical and biological information about protein folding, thus providing a versatile deep learning framework for structure prediction. Structure prediction methods using pre-trained protein language models (PLMs), e.g., HelixFold-Single16, OmegaFold17, and ESMFold18, have shown comparable performance to AF2 with accelerated prediction speed. PLMs can be trained with datasets comprising tens of millions of unlabeled protein sequences in a self-supervised manner, and can be subsequently applied to a variety of downstream tasks, such as druggable protein target prediction19, predicting protein function, and protein design20–22. Antibody-specific tools such as IgFold23, tFold-Ab24, DeepAb25 and NanoNet26 have also been developed to improve accuracy in CDR-H3 prediction. Among them, IgFold leverages sequence representations from PLMs to efficiently predict antibody structures within seconds, and notably, IgFold can provide accuracy comparable to AF2, enabling high-throughput prediction of antibody structures.
In this study, we present H3-OPT, which combines features of AF2 and PLMs to predict antibody structures. We compare H3-OPT with several other antibody structure prediction methods and found that it can provide a lower average RMSDCα for CDR-H3 loops than other algorithms in three subsets of varying difficulty. To further validate our model, we experimentally solved the structures of three anti-VEGF nanobodies predicted by H3-OPT27. We examined the potential applications of H3-OPT through analyzing antibody surface properties and antibody-antigen interactions. We demonstrate the informative value of high-quality H3 loops for predicting binding affinity, and further support the use of H3-OPT as a powerful and versatile tool for studying antigen-antibody interactions. This structural prediction tool can be used to optimize antibody-antigen binding, and to engineer therapeutic antibodies with biophysical properties.
Results
The H3-OPT workflow
Prior to training H3-OPT, we first conducted a thorough evaluation of currently available antibody structure prediction tools, including AF2, RoseTTAFold, trRosetta, NanoNet, DeepAb, MODELLER, and AbodyBuilder (Supplementary Section 1). AF2 provided high-accuracy predictions for the overall structures of both mAbs and Nbs, with template modeling scores (TM-scores) and global distance test scores (GDT scores) greater than 0.9. Based on the relatively higher accuracy of AF2 in structural predictions, its prediction was used to extract structural features of CDR-H3 loops for further optimization by H3-OPT. Using a mAb/Nb sequence and its Fv model structure from AF2 as input, H3-OPT was then used to generate a refined structure. As data quality has large effects on prediction accuracy, we constructed a non-redundant dataset (sequence identity < 0.8) with 1286 high-resolution (<2.5 Å) antibody structures from SAbDab28 (Fig. 1a). The dataset was then divided into training, validation, and test sets based on amino acid sequence identity, which was done by using the UCLUST29 software. To assess the prediction results, the test set was split into 3 subgroups according to the differences in AF2 accuracy (RMSD, a measure of the difference between the predicted structure and an experimental or reference structure) stemming from the length of CDR-H3 sequence: easy-to-predict targets (0 - 2 Å, Sub1); moderate difficulty targets (2 - 4 Å, Sub2); and challenging-to-predict targets (> 4 Å, Sub3), with average CDR-H3 loop sequence lengths of 9.12, 11.08, and 16.43, respectively.
The workflow of H3-OPT is depicted in Fig. 1b. H3-OPT consists of a Template module and a PLM-based structure prediction module (PSPM). The Template module determines whether to use PSPM to optimize CDR-H3 and comprises two sub-modules: a confidence-based module (CBM) and a template-grafting module (TGM). The CBM calculates an AF2 confidence score to evaluate the reliability of CDR-H3 loop sequence inputs through MSA and template searching. The TGM then identifies a template from the H3 template database and grafts the CDR-H3 loop onto the AF2 model if such a template is available (Fig. 1c).
The PSPM contains a series of attention-based components to ensure that the complete structural context is extracted from the input sequences. Specifically, the PSPM employs a row-wise gated attention layer that updates residue-level information and exchanges information within residue pair representations and residue features to infer relationships between the spatial and residue representations. Additionally, by leveraging sequence-level representations from PLMs along with residue-level information, the PSPM can integrate high-information features, such as protein folding patterns, across a vast protein space to predict final three-dimensional (3D) atomic coordinates (Fig. 1d).
To better understand the contribution of each module to H3-OPT accuracy with varying factors, we next examined how incorporating different approaches through the Template and PSPM modules could improve accuracy of H3-OPT. While AF2 showed markedly stronger predictions than IgFold in Sub1 (Fig. 2a), IgFold provided higher accuracy in Sub3 (Fig. 3a). In light of previous studies that showed PLM-based models are computationally more efficient but have unsatisfactory accuracy when high-resolution templates and MSA are available18, we therefore sought to combine the strengths of AF2 with PLM-based modeling. Similar to IgFold23, an initial version of H3-OPT lacking the CBM could not replicate the accuracy of AF2 in Sub1. Interestingly, we observed that AF2 confidence score of CDR-H3 shared a strong negative correlation with Cα-RMSDs (Pearson correlation coefficient =-0.67 (Fig. 2b), which led to us to hypothesize that AF2 models with high confidence scores might be sufficiently accurate, and therefore do not require further optimization. Following this notion, we incorporated a CBM sub-module that could directly retain high-confidence structures from AF2. TGM sub-module was added as a means of grafting identical H3 loop templates from public PDBs onto AF2-predicted models to further improve accuracy. Ablation studies in which the CBM or TGM were excluded to determine their respective contributions to the final predictive accuracy (Fig. 2c-f) revealed that H3-OPT without the Template Module could generate structures with average Cα-RMSDs of 1.76 Å, 2.76 Å, and 4.86 Å for the Sub1, Sub2, and Sub3 datasets, respectively.
Incorporating only the CBM into our model significantly improved RMSDCα score by 0.62 Å for Sub 1, but exerted negligible effects on Sub2 and Sub3. By contrast, including the TGM alone resulted in substantially improved RMSDCα for Sub2 and Sub3 by 0.68 Å and 1.04 Å, respectively, suggesting that templates were more effective for relatively long CDR-H3 antibodies. These results suggested that the combination of TGM and CBM modules could leverage available templates to improve prediction accuracy.
Since the large majority of sequences in Sub2 and Sub3 have long CDR-H3 loops with few sequence homologs, attaining high accuracy in structural predictions becomes increasingly challenging for AF230. Inspired by IgFold and other PLM-based methods (Fig 3a), we thus developed a PSPM module to capture structural information from the sequence embeddings of PLMs31. The key innovations of the PSPM for our workflow were the integration of sequence-level representations from PLMs and the simplified architecture of AF2. ProtTrans-T5, AntiBERTy, and ESM2 were initially used without fine-tuning, resulting in overall average Cα-RMSDs of 2.40 Å, 2.49 Å, and 2.32 Å, respectively (Table 1). To improve accuracy, we employed a fine-tuning approach for the downstream CDR-H3 structure prediction task. As ESM2 outperformed other PLMs in our test set, we fine-tuned parameters of all ESM2 hidden layers, which resulted in an overall RMSDCα of 2.24 Å for H3-OPT. It should be noted that most computational models, such as IgFold, froze PLM weights during model training23. Analysis of test subsets showed that H3-OPT could provide high accuracy in sidechain predictions, high template modeling scores (TM-scores), and high global distance test scores (GDT scores) in Sub2 and Sub3 (Fig. 3b-c).
To expand its generalizability, we simplified the Evoformer architecture and introduced parameter-sharing to directly predict Cartesian coordinates of CDR-H3 loops through both residue-level features and pair representations. In residue-level representations, rows represented amino acid types and structural features derived from AF2, while columns represented individual residues from the input sequence. The pair representations contained information about the residue pairs, such as Cα-Cα distance. These residue-level representations were updated by the attention-based layers and continuously communicated with pair representations; the pair representations were then updated via triangular multiplicative layers30. This simplified architecture improved model efficiency and avoided overfitting. We also applied a structural alignment strategy for feature extraction and model prediction. Given the high conservation of Fv structures, the alignment strategy was designed to effectively capture residue contribution to loop folding, resulting in improved training speed and accuracy. Taken together, H3-OPT provided high-accuracy predictions for CDR-H3 loops, facilitating the development of antibody therapeutics.
H3-OPT provides higher accuracy CDR-H3 models than existing structure prediction software
We then evaluated the performance of H3-OPT against currently available methods, including AF2, IgFold, HelixFold-Single, ESMFold, and OmegaFold. H3-OPT achieved an average Cα root mean square deviation (RMSDCα) of 2.24 Å for CDR-H3, while AF2 and IgFold had CDR-H3 RMSDs of 2.85 Å and 2.87 Å, respectively (Fig. 4a). HelixFold-Single, OmegaFold, and ESM-Fold generated comparatively poor predictions, with RMSDs for CDR-H3 of 3.39 Å, 3.75 Å, and 4.23 Å, respectively. As shown in Fig. 4b, H3-OPT provided higher accuracy predictions than other methods in all 3 subgroups, with average RMSDs of 1.10 Å, 2.28 Å and 3.99 Å, respectively. While the predictions of H3-OPT were comparable to AF2 in Sub1, its accuracy was higher than AF2 in Sub2 and Sub3.
In light of these results, we then sought to validate H3-OPT using three experimentally determined structures of anti-VEGF nanobodies, including a wild-type (WT) and two mutant (Mut1 and Mut2) structures, that were recently deposited in Protein Data Bank (Fig. 4c).
Although Mut1 (E45A) and Mut2 (Q14N) shared the same CDR-H3 sequences as WT (LengthCDR-H3 = 17), only minor variations were observed in the CDR-H3. H3-OPT generated accurate predictions with Cα-RMSDs of 1.510 Å, 1.541 Å and 1.411 Å for the WT, Mut1, and Mut2, respectively (The confidence scores of these AlphaFold2 predicted loops were all higher than 0.8, and these loops were accepted as the outputs of H3-OPT by CBM). Subsequent comparison with IgFold, showed that AlphaFold2 outperformed IgFold on these targets and IgFold could not accurately predict the short helix in CDR-H3, resulting in Cα-RMSDs of 2.776 Å (WT), 2.888 Å (Mut1), and 2.448 Å (Mut2) and more diverse conformations of Mut1 and Mut2. These results indicated that IgFold was capable of learning long-range correlations in protein sequences, but that these long-range correlations could introduce larger errors when MSA and templates were available. These results thus demonstrated that MSA could strongly influence the accuracy for which similar CDR-H3 template structures were available.
H3-OPT can predict antibody surface properties
To demonstrate how improved CDR-H3 structural accuracy can assist antibody engineering, we applied H3-OPT to predict surface properties. Comparison of H3-OPT with AF2 for identifying surface amino acids (SAAs) by the relative accessible surface areas (rASAs) of CDR-H3 loops showed that H3-OPT could predict SAAs, on average, close to that of the native structures (9.46 vs 9.40, Fig. 5a), whereas AF2 predicted an average of 9.81 SAAs. Furthermore, H3-OPT predicted closer values to native structures than AF2 in predicting diverse surface properties, such as the distribution of hydrophilic or charged SAAs (Fig. 5b).
To examine insights H3-OPT could provide into the biophysical properties of antibodies32, we next estimated the solvent-accessible surface areas (SASAs). The average SASAs of H3-OPT loops closely resembled that of the native structures, with larger differences in AF2 predictions (Fig. 5c). Plots of SASA distributions at each alignment position revealed smaller errors in H3-OPT than AF2, with ΔSASA ranging from -12.17 to 9.74 Å2 (Extended Data Fig. 1). In addition, comparison of surface charge distributions on the predicted CDR-H3 loops generated by H3-OPT and AF2 in an electrostatic map of a representative structure (PDB 5U3P, Fig. 5d) showed that H3-OPT predictions were consistent with the experimental structure. These results collectively showed that accurate surface properties predicted by H3-OPT could provide insights into antibody folding and stability, as well as their interactions with antigens.
H3-OPT can facilitate investigation of antibody-antigen interactions
To explore the potential applications of H3-OPT in studying antibody-antigen interactions, we analyzed contact patterns at predicted binding sites. Given that predicting the structure of an antibody-antigen complex remains challenging, we superimposed VH fragments from H3-OPT and AF2 onto experimentally determined native complex structures available in our test set. After identifying the contact residues of antigens by H3-OPT, we found that H3-OPT could substantially outperform AF2 (Fig. 6a), with a median precision of 0.82 and accuracy of 0.98 compared to 0.71 precision and 0.97 accuracy of AF2. Next, we estimated the distances among interface residues, which are related to binding affinity at the interface. We found that H3-OPT had less error in distance metrics than predictions by AF2 across different distance thresholds, with average mean squared errors of 2.42 Å and 4.85 Å, respectively (Fig. 6b). Furthermore, calculation of H3 contact propensities to assess potential differences in binding patterns between the predicted and native structures using high-quality H3 conformations indicated that H3-OPT also displayed higher accuracy in predicting contact propensities (Fig. 6c).Finally, to test whether H3-OPT was reliable for investigation of the detailed mechanisms underlying antigen-antibody binding, we generated contact maps and calculated binding affinities for complex structures obtained by H3-OPT, AF2, or through experiments (see Methods for details). We found that contact maps predicted by H3-OPT were consistent with those observed in the experimental structures (Fig. 6d). Since affinity prediction plays a crucial role in antibody therapeutics engineering, we performed MD simulations to compare the differences in binding affinities between AF2-predicted complexes and H3-OPT-predicted complexes. Calculation of binding affinities through MD simulations showed that the average affinities of structures obtained by H3-OPT prediction were closer to those of experimentally determined structures than values obtained through AF2 (Table 2). These cumulative findings illustrate the informative value of high-quality H3 loops for predicting binding affinity and facilitating antibody engineering, and further support the use of H3-OPT as a powerful and versatile tool for investigating antigen-antibody interactions.
Discussion
Several platforms for in silico antibody engineering have been developed to improve binding affinity33–36, humanization37,38 and stability39,40. At present, engineering these biophysical properties of therapeutic antibodies heavily relies on the availability of the structures of thousands of antibodies. The accurate structure of CDR-H3 is crucial for understanding the mechanism(s) of antigen-antibody interactions. Although AF2 performs well when MSA and templates are available, its predictive capability decreases for targets with long CDR-H3 loops. By contrast, PLMs leverage tens of millions of protein sequences to learn relevant structural patterns, consequently enhancing their capacity to predict orphan antibodies41. The results of IgFold indicate that PLMs, which incorporate transformer-based attention mechanisms, are capable of learning both local and global sequence information23. While global information can be useful for challenging targets, it may lead to large errors when templates were available, as demonstrated by our case study of anti-VEGF nanobodies (Fig. 4).
H3-OPT combines the strengths of AF2 and PLMs to predict the structures of mAbs and Nbs, and our results show that combining these features can improve the average RMSDCα by 20% over that of AF2 or IgFold. However, H3-OPT is less efficient than PLM-based methods like IgFold because it relies on AF2 structures. To improve efficiency, one possible solution is to cluster the targets based on sequence identity and compute AF2 structures for each cluster, followed by generation of mutant structures. Despite the lower efficiency, H3-OPT offers several notable advantages over IgFold. First, H3-OPT incorporates a simplified version of Evoformer, which improves training efficiency. In addition, H3-OPT employs an alignment-based strategy to extract structural features that are then used to directly predict the Cα coordinates of H3 loops, effectively capturing the residue features relevant to loop folding. Moreover, we found that IgFold uses an augmented dataset containing AF2-predicted structures, which could potentially propagate errors during model training. In contrast, H3-OPT was trained by using high-quality and non-redundant antibody structures. Finally, H3-OPT also shows lower Cα-RMSDs compared to AF2 or tFold-Ab for the majority (six of seven) of targets in an expanded benchmark dataset, including all antibody structures from CAMEO 2022 (Extended Data Fig.2).
Although H3-OPT can provide high accuracy with a small training dataset, it is likely that a larger dataset containing high-resolution structures will further improve our model. In the current study, we observed that the accuracy of AF2 CDR-H3 predictions was correlated with the length of the H3 loop. Thus, AF2 can provide accurate predictions for short loops with fewer than 10 amino acids, with PLM-based models offering little or no improvement in such cases. Conversely, antibodies with H3 loop lengths exceeding 25 residues (common in Sub3) pose a long-standing challenge for structural prediction algorithms, and none of the existing methods (including H3-OPT) tested here could provide accurate predictions due to a lack of homologous sequences and a high degree of freedom in this subgroup. Notably, H3-OPT outperformed AF2 in Sub3 because the context of protein sequences learned from ESM2 was fine-tuned to fit antibody structures. Furthermore, we envision that antibodies with H3 loop lengths ranging from 10 to 25 amino acids could be further optimized using PLMs with more model parameters. In addition, we attempted to optimize the H3 loop structure through molecular dynamics simulations and quantum mechanics-based methods (Supplementary Section 2). However, these approaches generally yielded less accurate predictions than AF2 due to inaccurately described solvent effects and other environmental factors. The development of a more accurate, physics-based method for modeling long loops may be another potentially effective strategy for improving H3-OPT.
Deep generative models are playing a vital role in the field of protein design, leading to several successful applications42–47. These models capture the intricate relationship between sequences and structures, enabling the generation of novel protein scaffolds not existing in nature. However, the potential immunogenicity concerns associated with these newly designed proteins require comprehensive preclinical and clinical investigations. Consequently, conventional mAbs and Nbs remain the prevailing choices in medicine. Recognizing that the properties of antibodies are heavily influenced by their structures, accurate prediction of the structures of mAbs and Nbs is of great significance in optimizing their therapeutic effectiveness and clinical relevance. To this end, H3-OPT has strong potential to accelerate optimization of antibody-antigen binding as well as engineering of therapeutic antibodies with specialized biophysical properties.
Method
H3-OPT dataset
Crystal structures of antibodies were obtained from SAbDab, a non-redundant nanobody database48 and a subset of DeepAb. The structures with resolution greater than 2.5 Å or redundant sequences with 95% or greater identity were removed (Fig. 1a). We removed the structures with missing loops in CDR-H3 or lengths greater than 45. The structures with missing residues at CDR-H3 loops were also dropped. To enhance the generalizability of H3-OPT, the sequences were clustered based on a 90% similarity cutoff by using UCLUST29. The resulting clusters were then randomly divided into training, validating, and testing sets to avoid overestimation of performance. The number of sequences in training, validating and testing sets was 1021(925 mAbs, 96 Nbs), 134 (122 mAbs, 12Nbs) and 131 (119 mAbs, 12 Nbs), respectively, with average CDR-H3 loop length of 11.8, 11.9 and 11.7, respectively. Schrödinger v. 2017-2 (Schrödinger, New York, NY, USA), Numpy 1.18.5 and Pandas 1.0.5 was used for data preparation.
Structure preparation
To address the transformational invariance for predicting 3D coordinates, we applied a structural alignment strategy. Prior to feature extraction, we removed all atoms of light chains in mAbs Fv structures and any non-standard residues in native structures. All antibody VH native structures were aligned using StructAlign module in Schrödinger with a randomly selected reference structure (PDBID: 1GIG). Next, we superimposed AF2 predicted models to their corresponding native structures. This alignment allowed the model to learn the interactions between residues located in H3 loops while disregarding the rotation and translation of the entire structure. The heavy chain fragments were determined using Chothia definitions49.
Feature generation
As shown in Table 3, structural features were extracted and aggregated into the following inputs to PSPM of H3-OPT: A residue feature vector of size [Nres, 34] was constructed by concatenating “Amino acid type”, “AF2 predicted coordinates”, “AF2 predicted backbone torsion angles”, “Torsion angles mask” and “Predicted residue mask”. The pairwise feature vector of size [Nres, Nres, 60] included the “Pairwise distances” and “Pairwise amino acid type”.
Network architecture
H3-OPT took the predicted H3 loops of AF2 as input and generate the antibody structure with the optimized CDR-H3 loops as output. H3-OPT was composed of a Template module and a PSPM. The Template module contains two submodules, i.e., CBM and TGM. The CBM calculated the average confidence score of H3 residues and retained AF2 structures with confidence score greater than 0.8. The TGM grafted the template loop structure onto the input AF2 model, when their CDR-H3 sequences were identical. We replaced the predicted CDR-H3 loop generated by AF2 with the template loop by aligning corresponding atoms, which included Cα atoms and carbonyl carbon atoms in the first two residues and Cα atoms and backbone nitrogen atoms in the last two residues.
The input residue-level features are extracted from AF2 predicted structures as mentioned above, resulting in a dimension of L × 34, where L is the padding length of antibody heavy chain. Similarly, the pairwise features has a dimension of L × L × 60. The PSPM network began with a row-wise gated multi-head self-attention layer which updated residue-level features from pair representations. After that, the residue-level information updated the pair representations via outer product mean module. Pair representations were updated by two multiplicative update modules. During training, the weights of the attention layer and the triangle multiplicative update modules were updated 4 times per epoch. Then, the final hidden states from ESM2 (esm2_t33_650M_UR50D) were passed to a linear layer and then concatenated with the residue-level features to enable the learning of latent information from vast protein sequences. To predict all Cα coordinates for the CDR-H3 loop, the output vector was passed through three linear layers with a hidden size of 64. In summary, PSPM utilized an attention-based mechanism to learn the contribution of individual residues and incorporate sequence representations from PLMs into final predictions. We used OpenFold50 (based on GitHub from June 2022: commit 3f57b4a041f063406059f42080ede6d495479617) to implement partial networks of AF2.
We employed a fine-tuning strategy to update the sequence representations in ESM2. Initially, we froze all weights of the 33 representation layers in ESM2 and updated only the weights of the attention layers and pair representation update modules. Subsequently, we fixed all weights of the remaining components in H3-OPT and performed fine-tuning exclusively on all hidden layers of ESM2 for the H3 loop prediction task. Finally, we fine-tuned all parameters of H3-OPT to predict the atomic coordinates of all CDR-H3 Cα atoms.
During training, we trained our model on the training set and validated it on the validation set. The dropout rates of H3-OPT were set to 0.25. Mean squared error (MSE) loss was utilized to train the model. We used Adam optimizer with a learning rate of 1LJ×LJ10−4, weight decay of 5LJ×LJ10−4 for training. The model was trained on an NVIDIA V100 super GPU and took one hour per epoch over the entire training set with a batch size of 64. We randomly selected 10% of all structures for model validation. The H3-OPT was implemented using PyTorch 1.12.1 in Python 3.7.2. The hyperparameters used during training process for H3-OPT were presented in Table 4.
Structure generation and refinement
We utilized a structure refinement strategy to generate structures of CDR-H3 loops and rectify structural errors. Initially, we used PSPM predictions to modify the Cα coordinates of predicted CDR-H3 loops generated by AF2. Then, we minimized the remaining atoms of the CDR-H3 loops, while keeping their Cα coordinates fixed, by using the OPLS 2005 force field51. After the initial energy minimization step, we applied a force constant of 10 kcal/mol to these fixed atoms to guarantee complete relaxation of the entire loops.
Validation settings
H3-OPT, AF2, HelixFold-Single, IgFold and ESMFold were compared on the test subset. Given that tFold-Ab is only available on webserver, we compared H3-OPT with tFold-Ab on CAMEO 2022 dataset. We removed the target antibody structures from AF2 database before prediction to evaluate the performance of AF2. This ensured that the predictions depended solely on the information derived from other template structures, thus avoiding biased results. Since all other methods did not require MSA searching, we did not exclude these targets from their database. To predict the entire Fv structures, we included 50 glycine residues as a linker between the heavy chain and the light chain for antibodies52. The Cα-RMSDs of all predicted H3 loops were calculated after superimposing the VH backbone heavy atoms (without CDR-H3) to the reference structures. All structures were generated using publicly available code repositories: AF2 v2.1.1(based on GitHub the November 2021 version at GitHub: commit 91b43223422420d1783ed802c8b3a8382a9309fd), IgFold (based on GitHub from February 2023: commit 6a09298d165ed1deb438c0b6eefcbcb03ed0eca5), HelixFold-Single (based on GitHub from January 2023: commit 5f39b2c2a4ecc00b89ba05b95dc56212bdd5d886) and ESMFold v1 (based on GitHub from October 2022: commit dc823b89c6acb9f67caea53704c2a97524fbd456).
Analysis of SASAs and pairwise residue contacts
We computed the SASAs and rASAs of all CDR-H3 loops by using the ANARCI webserver53 with the default settings. Residue with an rASA greater than 25% was considered surface residues54. Subsequently, we applied AHo alignment software55 to report the averaged SASA per alignment position. The samples and alignment positions with more than 10 gaps were removed to avoid randomness and bias, reducing the final samples to 123. Finally, the difference in SASAs between H3-OPT and AF2 was calculated by subtracting predicted H3 loops SASAs from the ground truth.
To examine the contribution of H3-OPT in discovering antibody-antigen interactions, we identified contact residue pairs that were within this distance as binding sites by setting a threshold of 5 Å. We next calculated the pairwise residue distance matrices for each individual predicted complex and native structure, where each element represented the closest distance between the heavy atoms of two residues. From these pairwise distances, we derived contact propensity matrices that specifically indicated the presence or absence of interactions between residues.
Analysis of surface electrostatic potential
We generated two-dimensional projections of CDR-H3 loop’s surface electrostatic potential using SURFMAP56 v2.0.0 (based on GitHub from February 2023: commit: e0d51a10debc96775468912ccd8de01e239d1900) with default parameters. The 2D surface maps were calculated by subtracting the surface projection of H3-OPT or AF2 predicted H3 loops to their native structures.
Binding affinity calculation
We analyzed the binding affinities of antibody-antigen complexes predicted by AF2 and H3-OPT. Their relative binding affinities were calculated through molecular dynamics simulations. Initially, the missing side-chains and loops in the antibody structures were filled using the Protein Preparation Wizard module within Schrödinger software. Subsequently, the tLEaP module of AMBER57 was employed to construct the simulation system with the ff19SB force field58 and OPC solvent model59. Additionally, the simulation system was solvated with a 0.15 M NaCl solution. Energy minimization was performed through a 5000-step steepest descent algorithm, followed by a 5000-step conjugate gradient algorithm. Afterwards, a 400-ps NVT simulation with a time step of 2 fs was performed to gradually heat the system from 0 K to 298 K (0–100 K: 100 ps; 100-298 K: 200 ps; hold 298 K: 100 ps), and a 100-ps NPT simulation with a time step of 2 fs was performed to equilibrate the density of the system. During heating and density equilibration, we constrained the antigen-antibody structure with a restraint value of 10 kcal·mol-1·Å-2. In the production run, 100-ns MD simulations were performed with a time step of 2 fs. The first 50 ns restrains the non-hydrogen atoms of the antigen-antibody complex, and the last 50 ns restrains the non-hydrogen atoms of the antigen, with a constraint value of 10 kcal·mol-1·Å-2. The distance cutoff for nonbonded interactions was set to 10 Å, and the Berendsen algorithm was utilized to maintain isotropic pressure coupling at 1 bar. The Langevin algorithm was employed to maintain the simulation temperature at 298 K. The relative binding affinities of the antigen-antibody complexes were evaluated using the MMPBSA module of AMBER software, which computed the MM/GBSA energies for the trajectory frames of last 10 ns.
Crystallization and data collection
The protein expression, purification and crystallization experiments were described previously27,60. The proteins used in the crystallization experiments were unlabeled. Upon thawing the frozen protein on ice, we performed a centrifugation step to eliminate any potential crystal nucleus and precipitants. Subsequently, we mixed the protein at a 1:1 ratio with commercial crystal condition kits using the sitting-drop vapor diffusion method facilitated by the Protein Crystallization Screening System (TTP LabTech, mosquito). After several days of optimization, single crystals were successfully cultivated at 21°C and promptly flash-frozen in liquid nitrogen. The diffraction data from various crystals were collected at the Shanghai Synchrotron Research Facility and subsequently processed using the aquarium pipeline.61
Statistics analysis
We conducted two-sided t-test analyses to assess the statistical significance of differences between the various groups. Statistical significance was considered when the p-values were less than 0.05. These statistical analyses were carried out using Python 3.10 with the Scipy library (version 1.10.1).
Data and code availability
The datasets of our study and the codes of H3-OPT are freely available at https://github.com/chdcg/H3-OPT.
Acknowledgements
This work was supported by the Tsinghua University Initiative Scientific Research Program (No.20231080030), Vanke Special Fund for Public Health and Health Discipline Development (2022Z82WKJ009), the Tsinghua-Peking University Center for Life Sciences (No.20111770319), Tsinghua University - Peking Union Medical College and Hospital Collaboration Foundation (No. 20191080837).
Competing interests
The authors declare no competing interests.
References
- 1Developing therapeutic approaches for twenty-first-century emerging infectious viral diseasesNature Medicine 27:401–410https://doi.org/10.1038/s41591-021-01282-0
- 2Antibodies to watch in 2021MAbs 13https://doi.org/10.1080/19420862.2020.1860476
- 3Safety and activity of anti-PD-L1 antibody in patients with advanced cancerThe New England Journal of Medicine 366:2455–2465https://doi.org/10.1056/NEJMoa1200694
- 4Fragment-based computational design of antibodies targeting structured epitopesScience Advances 8https://doi.org/10.1126/sciadv.abp9540
- 5Structural Modeling of Nanobodies: A Benchmark of State-of-the-Art Artificial Intelligence ProgramsMolecules 28
- 6RosettaAntibodyDesign (RAbD): A general framework for computational antibody designPlos Comput Biol 14https://doi.org/10.1371/journal.pcbi.1006112
- 7PIGS: automatic prediction of antibody structuresBioinformatics 24:1953–1954https://doi.org/10.1093/bioinformatics/btn341
- 8ABodyBuilder: Automated antibody structure prediction with data–driven accuracy estimationmAbs 8:1259–1268https://doi.org/10.1080/19420862.2016.1205773
- 9Comparative Protein Structure Modeling Using ModellerCurrent Protocols in Bioinformatics 15:5–5https://doi.org/10.1002/0471250953.bi0506s15
- 10Second antibody modeling assessment (AMA-II)Proteins: Structure, Function, and Bioinformatics 82:1553–1562https://doi.org/10.1002/prot.24567
- 11Antibody modeling assessment II. Structures and modelsProteins: Structure, Function, and Bioinformatics 82:1563–1582https://doi.org/10.1002/prot.24554
- 12Antibody modeling assessmentProteins: Structure, Function, and Bioinformatics 79:3050–3066https://doi.org/10.1002/prot.23130
- 13Protein structure predictions to atomic accuracy with AlphaFoldNature Methods 19:11–12https://doi.org/10.1038/s41592-021-01362-6
- 14The trRosetta server for fast and accurate protein structure predictionNat Protoc 16:5634–5651https://doi.org/10.1038/s41596-021-00628-9
- 15Accurate prediction of protein structures and interactions using a three-track neural networkScience 373:871–876https://doi.org/10.1126/science.abj8754
- 16Helixfold-single: Msa-free protein structure prediction by using protein language model as an alternativearXiv preprint arXiv:2207.13921 https://doi.org/10.48550/arXiv.2207.13921
- 17High-resolution de novo structure prediction from primary sequencebioRxiv https://doi.org/10.1101/2022.07.21.500999
- 18Evolutionary-scale prediction of atomic-level protein structure with a language modelScience 379:1123–1130https://doi.org/10.1126/science.ade2574
- 19QuoteTarget: A sequence-based transformer protein language model to identify potentially druggable protein targetsProtein Sci 32https://doi.org/10.1002/pro.4555
- 20Efficient evolution of human antibodies from general protein language modelsNature Biotechnology https://doi.org/10.1038/s41587-023-01763-2
- 21ProtGPT2 is a deep unsupervised language model for protein designNature Communications 13https://doi.org/10.1038/s41467-022-32007-7
- 22Large language models generate functional protein sequences across diverse familiesNature Biotechnology https://doi.org/10.1038/s41587-022-01618-2
- 23Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodiesNature Communications 14https://doi.org/10.1038/s41467-023-38063-x
- 24tFold-Ab: Fast and Accurate Antibody Structure Prediction without Sequence HomologsbioRxiv https://doi.org/10.1101/2022.11.10.515918
- 25Antibody structure prediction using interpretable deep learningPatterns 3https://doi.org/10.1016/j.patter.2021.100406
- 26NanoNet: Rapid and accurate end-to-end nanobody modeling by deep learningFrontiers in Immunology 13https://doi.org/10.3389/fimmu.2022.958584
- 27Polymorphic nanobody crystals as long-acting intravitreal therapy for wet age-related macular degenerationBioengineering & Translational Medicine https://doi.org/10.1002/btm2.10523
- 28SAbDab: the structural antibody databaseNucleic Acids Res 42:D1140–1146https://doi.org/10.1093/nar/gkt1043
- 29Search and clustering orders of magnitude faster than BLASTBioinformatics 26:2460–2461https://doi.org/10.1093/bioinformatics/btq461
- 30Highly accurate protein structure prediction with AlphaFoldNature 596:583–589https://doi.org/10.1038/s41586-021-03819-2
- 31Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequencesPNAS 118https://doi.org/10.1073/pnas.2016239118
- 32Anchor-Locker Binding Mechanism of the Coronavirus Spike Protein to Human ACE2: Insights from Computational AnalysisJournal of Chemical Information and Modeling 61:3529–3542https://doi.org/10.1021/acs.jcim.1c00241
- 33Computational design of antibody-affinity improvement beyond in vivo maturationNature Biotechnology 25:1171–1176https://doi.org/10.1038/nbt1336
- 34Application of an integrated computational antibody engineering platform to design SARS-CoV-2 neutralizersAntibody Therapeutics 4:109–122https://doi.org/10.1093/abt/tbab011
- 35Affinity enhancement of an in vivo matured therapeutic antibody using structure-based computational designProtein Sci 15:949–960https://doi.org/10.1110/ps.052030506
- 36Deep learning guided optimization of human antibody against SARS-CoV-2 variants with broad neutralizationPNAS 119https://doi.org/10.1073/pnas.2122954119
- 37Antibody humanization by structure-based computational protein designmAbs 7:1045–1057https://doi.org/10.1080/19420862.2015.1076600
- 38Structure guided homology model based design and engineering of mouse antibodies for humanizationBioinformation 10:180–186https://doi.org/10.6026/97320630010180
- 39In Silico Prediction of Diffusion Interaction Parameter (k(D)), a Key Indicator of Antibody Solution BehaviorsPharmaceutical Research 35https://doi.org/10.1007/s11095-018-2466-6
- 40Computational stabilization of T cell receptors allows pairing with antibodies to form bispecificsNature Communications 11https://doi.org/10.1038/s41467-020-16231-7
- 41Single-sequence protein structure prediction using a language model and deep learningNature Biotechnology 40:1617–1623https://doi.org/10.1038/s41587-022-01432-w
- 42De novo design of protein structure and function with RFdiffusionNature https://doi.org/10.1038/s41586-023-06415-8
- 43Robust deep learning-based protein sequence design using ProteinMPNNScience 378:49–56https://doi.org/10.1126/science.add2187
- 44Protein design and variant prediction using autoregressive generative modelsNature Communications 12https://doi.org/10.1038/s41467-021-22732-w
- 45Antigen-specific antibody design and optimization with diffusion-based generative models for protein structuresAdvances in Neural Information Processing Systems 35:9754–9767
- 46Broadly applicable and accurate protein design by integrating structure prediction networks and diffusion generative modelsbioRxiv https://doi.org/10.1101/2022.12.09.519842
- 47Generative models for graph-based protein designAdvances in neural information processing systems 32
- 48A non-redundant data set of nanobody-antigen crystal structuresData Brief 24https://doi.org/10.1016/j.dib.2019.103754
- 49Canonical structures for the hypervariable regions of immunoglobulinsJournal of Molecular Biology 196:901–917https://doi.org/10.1016/0022-2836(87)90412-8
- 50OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalizationbioRxiv https://doi.org/10.1101/2022.11.20.517210
- 51Integrated Modeling Program, Applied Chemical Theory (IMPACT)Journal of Computational Chemistry 26:1752–1780https://doi.org/10.1002/jcc.20292
- 52ColabFold: making protein folding accessible to allNature Methods 19:679–682https://doi.org/10.1038/s41592-022-01488-1
- 53ANARCI: antigen receptor numbering and receptor classificationBioinformatics 32:298–300https://doi.org/10.1093/bioinformatics/btv552
- 54A simple definition of structural regions in proteins and its use in analyzing interface evolutionJournal of Molecular Biology 403:660–670https://doi.org/10.1016/j.jmb.2010.09.028
- 55Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis toolJournal of Molecular Biology 309:657–670https://doi.org/10.1006/jmbi.2001.4662
- 56SURFMAP: A Software for Mapping in Two Dimensions Protein Surface FeaturesJournal of Chemical Information and Modeling 62:1595–1601https://doi.org/10.1021/acs.jcim.1c01269
- 57An overview of the Amber biomolecular simulation packageWiley Interdisciplinary Reviews: Computational Molecular Science 3:198–210https://doi.org/10.1002/wcms.1121
- 58ff19SB: Amino-Acid-Specific Protein Backbone Parameters Trained against Quantum Mechanics Energy Surfaces in SolutionJournal of Chemical Theory and Computation 16:528–552https://doi.org/10.1021/acs.jctc.9b00591
- 59Building Water Models: A Different ApproachJournal of Physical Chemistry Letters 5:3863–3871https://doi.org/10.1021/jz501780a
- 60Protein crystallization: from purified protein to diffraction-quality crystalNat Methods 5:147–153https://doi.org/10.1038/nmeth.f.203
- 61Aquarium: an automatic data-processing and experiment information management system for biological macromolecular crystallography beamlinesJournal of Applied Crystallography 52:472–477https://doi.org/10.1107/S1600576719001183
Article and author information
Author information
Version history
- Sent for peer review:
- Preprint posted:
- Reviewed Preprint version 1:
- Reviewed Preprint version 2:
- Reviewed Preprint version 3:
- Version of Record published:
Copyright
© 2023, Chen et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 1,702
- downloads
- 230
- citation
- 1
Views, downloads and citations are aggregated across all versions of this paper published by eLife.