RMSDs of AlphaFold-multimer structures from experimental unbound and bound structures.

Distribution of the RMSD between the top-ranked AlphaFold-multimer model and the experimental unbound and bound structures. For each target, the protein partners are split into receptor and ligand for comparison. Each symbol represents a flexibility category (rigid, medium, or flexible). (A) Dockground Benchmark set 5.5; (B) Antibody/nanobody-antigen targets from the benchmark.

Comparison of AFm pLDDT with structural metrics.

(A) AlphaFold pLDDT plotted against LDDTBU (local distance difference test). LDDTBU is calculated by comparing the unbound and bound environments of each residue. High scores correlate with high pLDDT (red). (B) Per-residue root-mean-square deviation between the unbound and bound structures (per-residue RMSDBU) versus AlphaFold pLDDT. Higher RMSDs correlate with lower pLDDT. (C) Structures for two targets (PDB IDs 1B6C and 1FQ1) with the experimental bound form (in gray) and the AlphaFold-multimer predicted model (red-white-blue in A and B). In both cases, the residues with low pLDDT scores (red) are those with incorrect conformations and larger conformational change.
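The per-residue RMSDBU in panel (B) can be illustrated with a minimal sketch. It assumes that the unbound and bound structures are already superimposed and that matching residues list their atoms in the same order; the function name `per_residue_rmsd` and the dictionary layout are hypothetical and not taken from the original analysis.

```python
import math


def per_residue_rmsd(unbound, bound):
    """Return {residue_id: RMSD in Å} between two pre-superimposed structures.

    Both arguments map a residue id to a list of (x, y, z) atom coordinates.
    Assumption: atoms of a given residue appear in the same order in both.
    Residues missing from either structure, or with mismatched atom counts,
    are skipped rather than guessed at.
    """
    result = {}
    for res_id, atoms_u in unbound.items():
        atoms_b = bound.get(res_id)
        if not atoms_u or atoms_b is None or len(atoms_b) != len(atoms_u):
            continue
        squared = sum(
            (u[i] - b[i]) ** 2
            for u, b in zip(atoms_u, atoms_b)
            for i in range(3)
        )
        result[res_id] = math.sqrt(squared / len(atoms_u))
    return result


# Toy coordinates, for illustration only (not real structural data).
unbound = {1: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], 2: [(0.0, 0.0, 0.0)]}
bound = {1: [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], 2: [(3.0, 4.0, 0.0)]}
rmsd_bu = per_residue_rmsd(unbound, bound)
```

Each value could then be paired with the residue's pLDDT to produce a scatter of the kind described in panel (B).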

AlphaFold multimer predictions with reference to bound experimentally-characterized structures.

Four targets with poor DockQ scores and high interface RMSDs: (i) activated Rac1 bound to phospholipase Cβ2 (2FJU), a rigid target (RMSDUB = 1.04 Å); (ii) nanobody bound to serum albumin (5VNW), a medium target (RMSDUB = 1.49 Å); (iii) 14-3-3 zeta isoform:serotonin N-acetyltransferase complex (1IB1), a difficult target (RMSDUB = 2.09 Å); and (iv) G6 antibody in complex with the VEGF antigen, a difficult target (RMSDUB = 2.51 Å). The bound structure is shown in green and the AlphaFold prediction is colored by residue-wise pLDDT from red (low confidence) to blue (high confidence).

Interface-pLDDT is the best indicator of model docking quality.

(A) Receiver operating characteristic (ROC) curves for different metrics on the docking dataset (n=254). Interface residues are defined as residues on one partner with any atom within 8 Å of any atom on the other partner. Interface-pLDDT is the average pLDDT of the interface residues. Avg-pLDDT is the average pLDDT across all residues in the predicted model. Interface contacts and interface residues are the counts of interface contacts and interface residues, respectively. Interface-pLDDT has the highest AUC, 0.86. (B) Confusion matrix with an interface-pLDDT threshold separating labels predicted false (<85) from labels predicted true (≥85), and an interface-RMSD threshold separating labels actually true (≤4 Å) from labels actually false (>4 Å). (C) Interface-pLDDT versus DockQ for all protein targets in the benchmark set. DockQ is calculated from the predicted AlphaFold structure and the experimental bound structure in the PDB. We fit a sigmoidal curve to these data.
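As an illustration of the interface definition in panel (A), the following sketch computes interface residues with an 8 Å atom-atom cutoff and averages their pLDDT. The function names, the dictionary-based input layout, and the inclusive treatment of the cutoff (distance ≤ 8 Å) are assumptions made for this example, not details confirmed by the caption.

```python
import math

INTERFACE_CUTOFF = 8.0  # Å, from the caption; inclusive comparison is an assumption


def interface_residues(partner_a, partner_b, cutoff=INTERFACE_CUTOFF):
    """Return (residues of A, residues of B) that have any atom pair within cutoff.

    Each partner maps a residue id to a list of (x, y, z) atom coordinates.
    """
    iface_a, iface_b = set(), set()
    for res_a, atoms_a in partner_a.items():
        for res_b, atoms_b in partner_b.items():
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                iface_a.add(res_a)
                iface_b.add(res_b)
    return iface_a, iface_b


def interface_plddt(partner_a, partner_b, plddt_a, plddt_b, cutoff=INTERFACE_CUTOFF):
    """Average pLDDT over interface residues of both partners; None if no interface."""
    iface_a, iface_b = interface_residues(partner_a, partner_b, cutoff)
    scores = [plddt_a[r] for r in iface_a] + [plddt_b[r] for r in iface_b]
    return sum(scores) / len(scores) if scores else None


# Toy two-residue partners, for illustration only (not real structural data).
receptor = {"A1": [(0.0, 0.0, 0.0)], "A2": [(20.0, 0.0, 0.0)]}
ligand = {"B1": [(5.0, 0.0, 0.0)], "B2": [(40.0, 0.0, 0.0)]}
receptor_plddt = {"A1": 90.0, "A2": 50.0}
ligand_plddt = {"B1": 80.0, "B2": 40.0}
score = interface_plddt(receptor, ligand, receptor_plddt, ligand_plddt)
```

The all-pairs loop is quadratic in the number of atoms; for full-size complexes a spatial index such as a k-d tree would be the usual choice, but the brute-force form keeps the definition explicit.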

AlphaRED protein docking pipeline.

Starting with protein sequences of putative complexes, we obtain predicted models from AlphaFold. Each model is accompanied by pLDDT scores, and based on the interface pLDDT we initiate either global rigid-body docking (interface pLDDT < 85) or flexible local docking refinement (interface pLDDT ≥ 85). For global rigid-body docking, the protein partners are first randomized in Cartesian coordinates and then docked with rigid backbones using temperature replica exchange docking within ReplicaDock2. Decoy structures are clustered based on energy before flexible local docking refinement. In flexible local docking, we use the directed induced-fit strategy in ReplicaDock2, with mobile residues selected by the AlphaFold residue-wise pLDDT scores (threshold of 80). The protocol moves the backbones with Rosetta’s Backrub or Balanced Kinematic Closure movers. Output structures are refined and top-scoring structures are selected based on interface energy.
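The branching logic of the pipeline can be summarized in a short sketch. The thresholds (85 for interface pLDDT, 80 for residue-wise pLDDT) come from the caption; the function names and return labels are hypothetical, and treating residues with pLDDT below 80 as the mobile ones is an assumption, since the caption states only the threshold value. This is not the ReplicaDock2 or Rosetta API.

```python
INTERFACE_PLDDT_CUTOFF = 85.0  # from the caption
MOBILE_PLDDT_CUTOFF = 80.0     # from the caption


def choose_docking_mode(interface_plddt):
    """Pick the docking stage to start from, given the model's interface pLDDT."""
    if interface_plddt >= INTERFACE_PLDDT_CUTOFF:
        return "flexible_local_refinement"
    return "global_rigid_body_docking"


def select_mobile_residues(residue_plddt):
    """Return residue ids treated as mobile during flexible local docking.

    Assumption: low-confidence residues (pLDDT below the cutoff) are mobile.
    """
    return sorted(
        res_id
        for res_id, score in residue_plddt.items()
        if score < MOBILE_PLDDT_CUTOFF
    )


# Toy values, for illustration only.
mode_low = choose_docking_mode(72.3)
mode_high = choose_docking_mode(91.0)
mobile = select_mobile_residues({10: 95.1, 11: 78.4, 12: 64.0, 13: 80.0})
```

In this sketch a model routed to global rigid-body docking would still go through flexible local refinement afterwards, as the caption describes; only the entry point differs.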

Docking performance.

Targets with interface-pLDDT ≤ 85 passed first to global rigid docking (red), whereas targets with interface-pLDDT > 85 proceeded directly to local flexible-backbone docking refinement (colored in shades of blue based on their interface-pLDDT scores). (A) Interface-RMSD from AlphaFold-multimer predicted models (y-axis) in comparison with AlphaRED models (x-axis). (B) Fraction of native-like contacts for models from AFm and AlphaRED, respectively. (a) and (b) indicate two targets (global and local docking, respectively) highlighted in Fig. 7. (C) Performance on the subset of antigen-antibody targets in DB5.5.

Global and local docking performance.

Docking performance for targets (a) activated Rac1 bound to phospholipase Cβ2 (2FJU) and (b) neutralizing anti-human antibody Fab fragment in complex with human GM-CSF (5C7X). Starting from the AFm model (orange), global docking performance on 2FJU shows the native-like binding site (gray) and the sampled AlphaRED decoy (blue). For local docking, backbone sampling on mobile residues predicted by residue pLDDT (outlined cartoon) shows that the AlphaRED decoy (blue) moves the backbone towards the bound form (gray).

AFm and AlphaRED performance on CASP15 targets.

Docking performance for CASP targets T205-T209. (top) T205. Interface score (Rosetta Energy Units, REU) vs interface RMSD (Å) for candidate docking structures generated by the AlphaRED docking pipeline. (top-right) The top-scoring AlphaRED model (green-blue) recapitulates the native interface (gray) and has an interface RMSD of 2.84 Å. The difference between the predicted model and the AFm model (orange) is evident. (bottom) Top-scoring AlphaRED predictions for targets T206, T207, T208, and T209, respectively.

Docking prediction success with AFm and AlphaRED.

Comparison of AFm (hashed) and AlphaRED performance for the DB5.5 benchmark set. (A) Classification based on the scale of flexibility: difficult (35 targets); medium (60 targets); rigid (159 targets). (B) Performance on the antibody-antigen complexes (67 targets) and other (non-antibody) targets.