Abstract
The structural basis for the pharmacology of G protein-coupled receptors (GPCR), the most abundant membrane proteins and the target of about 35% of approved drugs, is still a matter of intense study. What makes GPCRs challenging to study is the inherent flexibility and the metastable nature of interaction with extra- and intracellular partners that drive their effects. Here, we present a molecular dynamics (MD) adaptive sampling algorithm, namely multiple walker supervised molecular dynamics (mwSuMD), to address complex structural transitions involving GPCRs without energy input. By increasing the complexity of the simulated process, we first report the binding and unbinding of the vasopressin peptide from its receptor V2. Successively, we show the stimulatory (Gs) and inhibitory (Gi) G proteins binding to the adrenoreceptor β2 (β2 AR), and the adenosine 1 receptor (A1R), respectively. Then we present the complete transition of the glucagon-like peptide-1 receptor (GLP-1R) from inactive to active, agonist and Gs-bound state, and the GDP release from the activated Gs. Finally, we report the heterodimerization between the adenosine receptor A2 (A2AR) and the dopamine receptor D2 (D2R) and subsequent bivalent ligand binding. We demonstrate that mwSuMD can address, without or with limited energetic bias, complex binding processes such as G protein selectivity and homo- and heterodimerization that are intrinsically linked to the dynamics of the protein and out of reach of classic MD.
Introduction
Supervised molecular dynamics1, 2 (SuMD) is an efficient technique for studying ligand-receptor binding and unbinding pathways; here, we present the multiple walker enhancement (mwSuMD) to study a significantly wider range of structural transitions relevant to the drug design. We validated the method by applying it to G protein-coupled receptors (GPCRs), since their inherent flexibility is essential to their function and because these are the most abundant family of membrane receptors in eukaryotes3 and the target for more than one-third of drugs approved for human use4.
Vertebrate GPCRs are subdivided into five subfamilies (Rhodopsin or class A, Secretin or class B, Glutamate or class C, Adhesion, and Frizzled/Taste2) according to function and sequence5, 6. X-ray and cryo-electron microscopy (cryo-EM) show that GPCRs possess seven transmembrane (TM) helices connected by three extracellular loops (ECLs) and three intracellular loops (ICLs), with an extended and structured N-terminal extracellular domain (ECD) in all subtypes, but class A. The primary function of GPCRs is transducing extracellular chemical signals into the cytosol by binding and activating four G protein families (Gs/olf, Gi/o, G12/13, and Gq/11) responsible for decreasing (Gi/o) or increasing (Gs/olf) cyclic adenosine-3’,5’- monophosphate (cAMP), and generating inositol-1,4,5-triphosphate (IP3) and diacylglycerol (DAG) to increase Ca2+ intracellular levels (Gq)7.
Many aspects of GPCR pharmacology remain elusive: for example, the structural determinants of the selectivity displayed towards specific G proteins or the ability of certain agonists to drive a preferred intracellular signaling pathway over the others (i.e. functional selectivity or bias)8. GPCRs are challenging proteins to characterize experimentally due to their inherent flexibility and the transitory nature of the complexes formed with extracellular and intracellular effectors. Importantly, agonists can allosterically modify the receptor selectivity profile by imprinting unique intracellular conformations from the orthosteric binding site. The mechanism behind these phenomena is one of the outstanding questions in the GPCR field9.
Molecular dynamics (MD) is a powerful computational methodology that predicts the movement and interactions of (bio)molecules in systems of variable complexity, at atomic detail. However, classic MD sampling is limited to the microsecond or, in the best conditions, the millisecond time scale10, 11. For this reason, different algorithms have been designed to speed up the simulation of rare events such as ligand (un)binding and conformational transitions. Amongst the most popular and effective ones, are metadynamics12, accelerated MD (aMD)13, and Gaussian-accelerated MD (GaMD)14, which introduce an energy potential to overcome the energy barriers preventing the complete exploration of the free energy surface, thus de facto bias the simulation. Such methods have been used to study GPCR activation15, 16. Energetically unbiased MD protocols, on the other hand, comprise weighted ensemble MD (weMD)17 and SuMD1, 18, which have largely been applied to (unbinding) small molecules, peptides, and small proteins1, 18–22. Since SuMD is optimized for (un)bindings, we have designed mwSuMD to address more complex conformational transitions and protein-protein associations. GPCRs preferentially couple to very few G proteins out of 23 possible counterparts9, 23. It is increasingly accepted that dynamic and transient interactions determine whether the encounter between a GPCR and a G protein results in productive or unproductive coupling24. MD simulations are considered a useful orthogonal tool for providing working hypotheses and rationalizing existing data on G protein selectivity. However, so far, it has not delivered as expected. Attempts so far have employed energetically biased simulations or have been confined to the Gα subunit15, 16, 23.
We tested mwSuMD on a series of increasingly difficult structural transitions involving both class A and class B1 GPCRs. Firstly, we validated the method on the nonapeptide arginine vasopressin (AVP) by simulating binding (dynamic docking) and unbinding paths from the vasopressin 2 receptor (V2R). Dynamic docking, although more computationally demanding than standard molecular docking, provides insights into the binding mode of ligands in a fully hydrated and flexible environment. Moreover, it informs about binding paths and the complete mechanism of formation leading to an intermolecular complex, delivering in the context of binding kinetics25 and structure-kinetics relationship (SKR) studies26. We then show the Gs and Gi G proteins binding to the adrenoreceptor β2 (β2 AR) and the adenosine 1 receptor (A1R), respectively, starting from different conditions. GPCRs preferentially couple to very few G proteins out of 23 possible counterparts9, 23.
A further case study is the activation of the class B1 GPCR glucagon-like peptide-1 receptor (GLP-1R) by the small molecule PF06882961. GLP-1R is a validated target in type 2 diabetes and probably the best-characterized class B1 GPCR from a structural perspective. GLP-1R is the only class B1 receptor with structurally characterized non-peptidic orthosteric agonists, which makes it a model system for studying the druggability of the entire B1 subfamily. After GLP-1R agonist binding and activation, the coupling of Gs and the release of GDP, which is the rate-limiting step of the G protein activation, was simulated for the first time using an energy-unbiased method.
The last GPCR key process addressed by mwSuMD is the heterodimerization in the membrane between the adenosine receptor A2 (A2AR) and the dopamine receptor D2 (D2R). The A2AR:D2R heterodimer27 is a therapeutic target for neurodegenerative diseases, Parkinson’s disease, and schizophrenia28–30 due to the reciprocal antagonistic allosteric effect between monomers31. A2AR activation reduces the binding affinity of D2R agonists, while A2AR antagonists enhance the dopaminergic tone by decreasing the adenosine negative allosteric modulation on D2R. Heterobivalent ligands able to inhibit A2AR and activate D2R represent a valuable pharmacological tool32 and, in principle, therapeutic options for conditions characterized by a reduction of dopaminergic signaling in the central nervous system. The successive dynamic docking of the heterobivalent ligand compound 2633 to the heterodimer suggested by mwSuMD produced a ternary complex stabilized by lipids and is indicative of the usefulness of mwSuMD for illuminating the molecular events involved in GPCR function.
Results and Discussion
Short mwSuMD time windows improve the AVP dynamic docking prediction
Arginine vasopressin (AVP) is an endogenous hormone (Figure S1a) that mediates antidiuretic effects on the kidney by signaling through three class A GPCR subtypes: V1a and V1b receptors activate phospholipases via Gq/11, while the V2 receptor (V2R) activates adenylyl cyclase by interacting with Gs 34 and is a therapeutic target for hyponatremia, hypertension, and incontinence35. AVP is amphipathic and in the bound state interacts with both polar and hydrophobic V2R residues located on TM helices and ECLs (Figure S1b). Although AVP presents an intramolecular C1-C6 disulfide bond that limits the overall conformational flexibility of the backbone, it has a high number of rotatable bonds, making dynamic docking complicated36. We assessed the performance of mwSuMD and the original version of SuMD in reconstructing the experimental V2R:AVP complex using different settings, simulating a total of 92 binding events (Table S1). As a reference, the AVP RMSD during a classic (unsupervised) equilibrium MD simulation of the X-ray AVP:V2R complex was 3.80 ± 0.52 Å (Figure S2). SuMD1, 18 produced a minimum root mean square deviation (RMSD) to the cryo-EM complex of 4.28 Å, with most of the replicas (i.e., distribution’s mode) having an RMSD close to 10 Å (Figure S3a). mwSuMD, with the same settings (Figure S3b, Table S1) in terms of time window duration (600 ps), metric supervised (the distance between AVP and V2R), and acceptance method (slope) produced slightly more precise results (i.e., distribution’s mode RMSD = 7.90 Å) but similar accuracy (minimum RMSD = 4.60). Supervising the AVP RMSD to the experimental complex rather than the distance (Figure S3c) and using the SMscore (Equation 1) as the acceptance method (Figure S3d) worsened the prediction. Supervising distance and RMSD at the same time (Figure S3e), employing the DMscore (Equation 2), recovered accuracy (minimum RMSD = 4.60 Å) but not precision (distribution mode RMSD = 12.40 Å). Interestingly, decreasing the time window duration from 600 ps to 100 ps impaired the SuMD ability to predict the experimental complex (Figure 1a), but enhanced mwSuMD accuracy and precision (Figure 1b-d). The combination of RMSD as the supervised metric and SMscore produced the best results in terms of minimum RMSD and distribution mode RMSD, 3.85 Å and 4.40 Å, respectively (Figure 1d, Video S1), in agreement with the AVP deviations in the equilibrium MD simulation of the X-ray AVP:V2R complex (Figure S2). These results confirm the inherent complexity of reproducing the AVP:V2R complex via dynamic docking and suggest that short time windows can improve mwSuMD performance on this system. However, it is necessary to know the final bound state to employ the RMSD as the supervised metric, while the distance is required to dynamically dock ligands with unknown bound conformation as previously reported1, 23. Both distance and RMSD-based simulations delivered insights into the binding path and the residues involved along the recognition route. For example, mwSuMD pinpointed V2R residues E184ECL2, P298ECL3, and E303ECL3 (Figure S4a) as involved during AVP binding, although not in contact with the ligand in the orthosteric complex. None of them are yet characterized through mutagenesis studies according to the GPCRdb37.
Further to binding, a SuMD approach was previously employed to reconstruct the unbinding path of ligands from several GPCRs2, 38. We assessed mwSuMD’s capability to simulate AVP unbinding from V2R. Five mwSuMD and five SuMD replicas were collected using 100 ps time windows (Table S1). Overall, mwSuMD outperformed SuMD in terms of time required to complete a dissociation (Figure S5, Video S2), producing dissociation paths almost 10-fold faster than SuMD. Such rapidity in dissociating inherently produces a limited sampling of metastable states along the pathway, which can be compensated by seeding classic (unsupervised) MD simulations from configurations extracted from the unbinding pathway39, 40. Here, the set of V2R residues involved during the dissociation was comparable to the binding (Figure S4b), though ECL2 and ECL3 were slightly more involved during the association than the dissociation, in analogy with other class A and B GPCRs20, 39.
Binding simulations between G proteins and a class A GPCR
We tested the ability of mwSuMD to simulate the binding between a prototypical class A receptor, the β2 adrenoreceptor (β2 AR), and the stimulatory G protein (Gs), without energy input. mwSuMD simulations started from the intermediate, agonist-bound conformation of the β2-AR and the inactive Gs to resemble pre-coupling conditions. Three mwSuMD replicas were performed by supervising the distance between Gs helix 5 (α5) and β2 AR as well as the RMSD of the intracellular end of TM6 to the fully active state of the receptor (Table S1). To monitor the progression of the simulations, we computed the RMSD of the Cα atoms of the Gα and Gβ subunits to the experimental complex41 (Video S3, Figure 2ab). During two out of three replicas, both Gα and Gβ achieved distance values close to 5 Å (minimum RMSD = 3.94 Å and 3.96 Å respectively), in good agreement with the reference (the β2 AR:Gs complex, PDB 3SN6, Figure 2c). The flexibility of Gsβ is backed by both MD and cryo-EM data suggesting G protein rocking motions around Gsα:receptor interactions20, 42.
Usually, ICL3 of the GPCR and the Gs protein loop hgh4, which is not found in other Gα43, are masked out from deposited cryo-EM structures due to their high flexibility and therefore low resolution. During our simulations, these two loops formed polar intermolecular interactions through R239ICL3, R260ICL3, K235ICL3, and E322hgh4.12, D323hgh4.13. Further transient interactions, not visible in the experiential structures, involved a mix of conserved and unique residues forming hydrogen bonds (Table S2): R63ICL1-E392 α5.24, K2325.71-D378Α5.10, K2355.74-D378 α5.10, K235ICL3-D343H4.13, K2676.29-L394α5, R239ICL3-E314hgh4.04, and S1373.56-D381 α5.13. None of the interactions reported in Table S2 is evident from the experimental β2 AR:Gs complex, implying that mwSuMD can deliver useful working hypotheses for mutagenesis and spectroscopic experiments from out-of-equilibrium simulations. Results also suggest that the Gs binding is driven by a combination of conserved and unique transitory interactions with β2 AR, possibly contributing to G protein selectivity. We speculate that the conserved interactions would be necessary for the binding, regardless of the receptor:G protein couple involved, while the transitory interactions due to the dynamics of G protein binding should produce an effective engagement of the G protein.
A possible pitfall of the above-reported Gs:β2 AR mwSuMD binding simulation is that G proteins bear potential palmitoylation and myristoylation sites that anchor the inactive trimer to the plasma membrane44, 45, de facto restraining possible binding paths to the receptor. Recently, the Gi binding to A1R was simulated by combining the biased methods GaMD with SuMD46 but without taking into account the role played by membrane-anchoring post-translational modifications on the Gi binding pathway. To address this point and test the possible system dependency of mwSuMD, we prepared a different class A GPCR, the adenosine A1 receptor (A1R), and its principal effector, the inhibitory G protein (Gi) considering the Giα residue C3 and Gγ residue C65 as palmitoylated and geranylgeranylated respectively and hence inserted in the membrane. Both classic (unsupervised) and mwSuMD simulations were performed on this system for comparison (Video S4, Figure 2d). In about 50 ns of mwSuMD, the Giα subunit engaged its intracellular binding site on A1R and formed a complex in good agreement with the cryo-EM structure (PDB 6D9H, RMSD ≍ 5 Å). For comparison, 1 μs of cMD did not produce a productive engagement as the Giα remained at RMSD values > 40 Å (Figure 2d), suggesting the effectiveness of mwSuMD in sampling G protein binding rare events without the input of energy. The membrane anchoring affected the overall Gi binding and the final complex, which was rotated compared to the experimental structure due to the lipidation of Giα and Gγ (Figure 2e). This suggests that future, more comprehensive studies of G protein binding and activation should consider several G protein orientations around the receptor as the starting points for mwSuMD simulations, to evaluate as many binding paths as possible
PF06882961 binding, GLP-1R activation, and Gs protein activation
The GLP-1R has been captured by cryo-EM in both the inactive apo (ligand-free) and the active (Gs-bound) conformations and in complex with either peptide or non-peptide agonists47–52. In the inactive apo GLP-1R, residues forming the binding site for the non-peptide agonist PF06882961 are dislocated and scattered due to the structural reorganization of the transmembrane domain (TMD) and extracellular domain (ECD) (Figure S6) that occurs on activation. Moreover, GLP-1R in complex with GLP-1 or different agonists present distinct structural features, even amongst structurally related ligands (Figure S7). This complicates the scenario and suggests divergent recognition mechanisms amongst different agonists. We simulated the binding of PF06882961 using multistep supervision on different metrics of the system (Figure 3) to model the structural hallmark of GLP-1R activation (Video S5, Video S6).
Several metrics were supervised in a consecutive fashion. Firstly, the distance between PF06882961 and the TMD as well as the RMSD of the ECD to the active state (stage 1); secondly, the RMSD of ECD and ECL1 to the active state (stage 2); thirdly, the RMSD of PF06882961 and ECL3 to the active state (stage 3); lastly, only the RMSD of TM6 (residues I345-F367, Cα atoms) to the active state (stage 4). The combination of these supervisions produced a conformational transition of GLP-1R towards the active state (Figure 3, Video S6). Noteworthy, mwSuMD, like any other CV-based technique, requires some degree of knowledge of the system simulated. The sequence of these supervisions was arbitrary and does not necessarily reflect the right order of the steps involved in GLP-1R activation. This kind of planned multistep approach is feasible when the end-point receptor inactive and active structures are available, and the inherent flexibility of different domains is known. In class B GPCRs, the ECD is the most dynamic sub-structure, followed by the ECL1 and ECL3 which display high plasticity during ligand binding20, 53. For this reason, we first supervised these elements of GLP-1R, leaving the bottleneck of activation, TM6 outward movement, as the last step. However, the protocol employed can be tweaked to study how each conformational transition takes place and influences the receptor domains. Structural elements not directly supervised, such as TM1 or TM7, displayed an RMSD reduction to the active state because they were influenced by the movement of supervised helixes or loops. For example, the supervision of ECL3 (stage 3) and TM6 (stage 4) facilitated the spontaneous rearrangement of the ECD to an active-like conformation after the ECD had previously experienced transient high flexibility during stages 2 and 3 (Figure 3).
It is worth noting that the only inactive, almost full-sequence, GLP-1R structure available, was solved with an antibody bound to the ECD. This crystallization strategy for stabilizing the receptor might have forced the ECD in a closed conformation that engages the EC vestibule of GLP-1R and possibly restrained the whole TMD in an altered conformation that deviates from the active conformation. This might explain why the RMSD of the TMD elements monitored during the simulation rarely reach values lower than 3 or 4 Å. During the supervision of ECL3 and PF06882961 (stage 3), we observed a loosening of the intracellular polar interactions that stabilize GLP-1R TM6 in the inactive state. As a result, the subsequent supervision of TM6 (residues I345-F367, Cα atoms) rapidly produced the outward movement of TM6 towards the active state, in the last step of the mwSuMD simulation (stage 4). Taken together, these results suggest a concerted conformational transition for ECD and ECL1 during the binding of PF06882961 and an allosteric effect between ECL3 and the bottom of TM6. While the intracellular polar interactions were destabilized by the ECL3 transition to an active-like conformation (stages 2 and 3), the outward movement of TM6 (stage 4) did not favor the closure of ECL3 towards PF06882961, which appear to be driven by direct interactions between the ligand and R3105.40 or R3807.35. Interestingly, the mwSuMD simulation during Stage 4 (TM6 supervision) sampled a counterclockwise rotation of the helix (Figure 4a), consistent with the GLP-1R cryo-EM structures in the active, Gs-coupled state50, 54. To the best of our knowledge, this is the first time an MD simulation captures the TM6 rotation upon receptor activation as results reported so far are largely limited to the TM6 opening and kinking55.
Encouraged by results previously obtained on Gi binding to β2 AR and A1R (Figure 3), we extracted the GLP-1R active conformation described above and simulated the Gs binding to its intracellular side. Starting from the inactive, membrane-anchored Gs, we performed three independent mwSuMD replicas by supervising the distance between Gs helix α5 and GLP-1R residues located at the intracellular binding interface. All three mwSuMD replicas showed the Gs approaching GLP-1R, with two out of three reaching an RMSD of the Gsα subunit close to or less than 10 Å, compared to the experimental complex 7LCI (Figure 4b). Replica 2, in particular, well reproduced the cryo-EM GLP-1R:Gs complex. According to the model of G protein activation, the G protein binding has the effect of allosterically stabilizing the orthosteric agonist in complex with the receptor56 and destabilizing the guanosine-diphosphate (GDP) bound to Gα, triggering its release and exchange with guanosine-triphosphate (GTP)57, upon opening of the G protein alpha-helical domain (AHD). In accordance with this model, PF06882961 and GDP were respectively stabilized and destabilized during the simulated Gs association (Figure 4c,d). The analysis of atomic contacts along the binding path of Gs to GLP-1R highlights a few persistent interactions not observed in the equilibrium MD simulations of GLP-1R:Gs cryo-EM complexes56; in particular, we propose the hidden interaction between D3446.33 and R385 α5 to be pivotal for Gs coupling (Table S3).
We further investigated the Gs activation mechanism by extending Replica 2 (Figure 4b) and supervising the opening of the Gs alpha-helical domain (AHD), which is considered a necessary step to allow GDP release from the Ras-like domain58. We first easily obtained the opening of AHD in the extended Replica 2 and successively supervised the GDP unbinding in further three replicas, seeded after the AHD opening. In one of these three mwSuMD simulations, the nucleotide dissociated from Gs (Figure 4e-h). Video S7 shows the full Gs binding, AHD opening, and GDP release. Interestingly, no conformational changes of the β6-α5 loop happened before or during the GDP dissociation, suggesting that its conformational change as captured in the nucleotide-free GLP-1R:Gs complex (Figure S8) occurs after the GDP release as a result of the loss of binding stabilization, rather than being an initiator of the GDP dissociation. Our simulation is consistent with a major role played by α5 tilting and allosteric communication through internal structural elements like the β2-β3 strands, which weaken both GDP phosphate and purine binding sites as previously suggested by other groups59–61.
The heterodimerization between A2A and D2R, and binding simulations of the heterobivalent ligand compound 26
The current structural model of the A2AR:D2R heterodimer is that TM4 and TM5 from both receptors contribute to the primary dimer interface, although the involvement of TM7 is not ruled out62. Following this interaction model, we first dynamically docked A2AR and D2R in an explicit 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine (POPC) membrane model, then simulated the binding of the heterobivalent compound 2633 (CP26) to the preformed A2AR:D2R heterodimer (Video S8). Since membrane proteins are characterized by slow lateral diffusion63, we encouraged the encounter between A2AR and D2R by inputting energy using metadynamics and adiabatic MD during mwSuMD (hybrid mwSuMD/metadynamics/aMD), followed by 1.5 μs of classic MD (cMD) to relax the system and check the stability of the A2AR:D2R interactions.
During the first 200 ns of simulation with energy bias (Figure 5a,c and Figure S9a), A2AR and D2R rapidly moved close to each other and reached a distance of about 30 Å (computed between centroids), before stabilizing at around 40 Å (Figure 5a). The computed molecular mechanics combined with the Poisson–Boltzmann and surface area continuum solvation (MM-PBSA) binding energy suggested two energy minima (Figure 5c) at these distances. The successive cMD simulation did not produce remarkable changes in the distance between receptors (Figure 5b), although the energy fluctuated before reaching about −10 kcal/mol, at the end of the simulation (Figure 5d). The sharp energy minima after 25 and 150 ns were due to the high number of direct contacts between A2AR and D2R (Figure S9), favored by the energy added to the system. When the input of energy bias was stopped (Figure 5b,d) the POPC residues re-equilibrated at the interface between proteins and mediated intracellular polar interactions between R1504.40 D2R, Y1464.36 D2 and R1995.60 A2A, Y1033.51 A2A as well as extracellular polar interactions between the top of TM4D2, TM5D2 and TM5A2A, TM6A2A (Figure 5f), suggesting that the A2AR:D2R heterodimerization relies on lipids to mediate short-range interactions between receptors.
The dynamic docking of the heterobivalent ligand C26 further stabilized the A2AR:D2R dimer (Figure 5e), in line with experimental data33. C26 reached the bound state rapidly, inserting the agonist pharmacophore within the D2R orthosteric site (Figure S10, Video S8), while the pyrazole-triazole-pyrimidine scaffold remained in a metastable complex with A2AR, before completely binding the orthosteric site at the end of the simulation (Figure S11, Video S8). In the final state, the long linker between pharmacophores extended over the top of the interface formed by A2AR and D2R at the level of the receptors’ ECL2 (Figure 5g). A network of polar interactions between POPC, Y179A2A, and Y192D2 contributed to stabilizing this ternary complex. Interestingly, the latter residues were pinpointed as important for A2AR:D2R interactions62. From a binding energy perspective, C26 reached the most stable configurations between 80 and 100 ns (Figure S12), before the pyrazole-triazole-pyrimidine component of the ligand completed the binding to A2AR. This suggests some contribution of the linker to the overall stability of the ternary complex with A2AR and D2R. Two out of four mwSuMD replicas produced A2AR:D2R:C26 ternary complexes with C26 engaged both by the orthosteric site of A2AR and D2R, while in the remaining two replicas the A2AR pharmacophore remained stacked on the extracellular vestibule of the receptor, although in the proximity of the binding site (Figure S10). Considering these results, we suggest that the medicinal chemistry optimization of bivalent ligands should involve the tuning of the on-rate kinetic constant kon measured on each monomer of the heterodimer, to differentiate which pharmacophoric scaffold is kinetically more affected by the chemical linking and therefore aiding the surgical improvement of the bivalent ligand’s affinity.
Discussion
Classic MD simulations sample phase space with an efficiency that depends on the energy barrier between neighboring minima. Processes like (un)binding and protein activation require the system to overcome numerous energy barriers, some of which create bottlenecks that slows the transition down to the millisecond, or second, time scale. To overcome some of these limits, we have developed and tested on complex structural events characterizing GPCRs an energetically unbiased adaptive sampling algorithm, namely multiple walker mwSuMD, which is based on traditional SuMD, while drawing on parallel multiple replica methods64, 65,
mwSuMD performed similarly to SuMD for the dynamic docking of AVP to V2R when time windows of 600 ps were employed. Time windows of 100 ps remarkably improved mwSuMD. Usually, dynamic docking is performed to either predict the geometry of complexes or sample the binding path of an already-known intermolecular complex, or both. The RMSD of AVP to the experimental coordinates as the supervised metric produced the best results. Consequently, the RMSD should be the metric of choice to study the binding path of well-known intermolecular complexes. The distance, on the other hand, is necessary when limited structural information about the binding mode is available. In the absence of structural information regarding the final bound state, it is possible to sample numerous binding events employing mwSuMD and evaluate the final bound states rank by applying end-point free energy binding methods like the molecular mechanics energies combined with the Poisson– Boltzmann or generalized Born and surface area continuum solvation (MM/PBSA and MM/GBSA66) models. Our simulations propose that remarkable predictivity can be obtained with distance-driven mwSuMD, as demonstrated by the lowest deviation from the experimental AVP:V2R complex. Remarkably, the dissociation of AVP from V2R was simulated much more rapidly by mwSuMD than by SuMD, suggesting it is an efficient tool for studying the dissociation of ligands from GPCRs.
mwSuMD was further tested on the Gs and Gi binding to β2 AR and A1R, respectively. mwSuMD produced G protein:GPCR complexes in remarkable agreement with experimental structural data without the input of energy in a few hundred nanoseconds when starting from inactive Gs and the intermediate active β2 AR, or a few tens of nanoseconds when considering the active-state A1R and Gi anchored to the plasma membrane through the palmitoylation and the geranylgeranylation of Gαγ44, 45, 67.
We increased the complexity of binding simulations by considering GLP-1R and the non-peptide agonist PF06882961. Using mwSuMD, we obtained a binding of PF06882961in good agreement with the cryo-EM structure, followed by an active-like conformational transition of GLP-1R. The choice of the metrics supervised was driven by structural data available50 and extensive preparatory MD simulations, however, binding routes are possible from either the bulk solvent or the membrane39, 68, 69. Future studies on GLP-1R and other class B1 GPCRs should consider different starting points for the ligand and alternative apo receptor conformations to improve the sampling.
mwSuMD was able to simulate the Gs binding to the active GLP-1R and the subsequent GDP release. To our knowledge, this is the first time the whole sequence of events leading from an inactive GPCR to the GDP release is simulated. Remarkably, this was obtained without biasing the simulations with external energy. Our results suggest that the full rotation and elongation of α5 as in cryo-EM structures could occur after the GDP release, supporting the role of hidden, metastable interactions as the driving force of G protein coupling and selectivity, in accordance with recent work on GLP-1R70.
The final case study was the dimerization process between A2AR and D2R in a membrane model. To speed up the encounter between receptors, we introduced an energy bias in the form of abMD and metadynamics. Although mwSuMD is an unbiased adaptive sampling method, it can be easily coupled to many forms of bias to favor the simulation of energy-requiring processes. Our results suggest a fundamental contribution of the phospholipids on the stabilization of the heterodimer, in agreement with experiments71, 72 but confuting protein-protein molecular docking results frequently predicting extended interfaces between monomers73. mwSuMD was able to dynamically dock the heterobivalent ligand CP26, supporting a stabilizing effect on the A2AR:D2R heterodimer. A complete characterization of the possible interfaces between GPCR monomers, which falls beyond the goal of the present work, should be achieved by preparing different initial unbound states characterized by divergent relative orientations between monomers to dynamically dock.
mwSuMD is suited for a mechanistic description of structural events. For this reason, it well integrates with mutagenesis and kinetics experiments informed by the method, and required for the validation of its results in line with any other computational technique. We note that mwSuMD trajectories, since can describe the sequential states along a transition pathway, can represent a precious backbone for further MD sampling aimed at quantifying or predict the kinetics of the event. Approaches such as path collective variable metadynamics74, Markov State Models75, or machine learning models76, 77 can be informed by mwSuMD. Future work will be addressed in this direction.
In summary, we showcased the applicability domain of mwSuMD to key, but scarcely understood, aspects of GPCR structural biology, pharmacology, and drug design hitherto unaddressed by unbiased simulations. Given the generality and simplicity of its implementation, we anticipate that mwSuMD can be employed to study a wide range of structural phenomena characterizing both membrane and cytosolic proteins. mwSuMD is an ongoing project updated on https://github.com/pipitoludovico/mwSuMD_multiEngine and in the recent implementation can exploit ACEMD78, NAMD79, GROMACS80 and OPENMM81 as MD engine.
Methods
Multiple walker SuMD (mwSuMD) protocol
The supervised MD (SuMD) is an adaptive sampling method82 for speeding up the simulation of binding events between small molecules (or peptides83, 84) and proteins1, 18 without the introduction of any energetic bias. Briefly, during SuMD, a series of short unbiased MD simulations are performed, and after each simulation, the distances between the centers of mass (or the geometrical centers) of the ligand and the predicted binding site (collected at regular time intervals) are fitted to a linear function. If the resulting slope is negative (showing progress towards the target) the next simulation step starts from the last set of coordinates and velocities, otherwise, the simulation is restarted by randomly assigning the atomic velocities.
In the implementation for ACEMD used in this work, mwSuMD needs as input the initial coordinates of the system as a pdb file, the coordinates, and the atomic velocities of the system from the equilibration stage, the topology file of the system, and all the necessary force field parameters. The user can decide to supervise one (X) or two metrics (X’, X’’) of the simulated system over short simulations seeded in batches, called walkers. In the former case, either the slope of the linear function interpolating the metric values or a score can be adopted to decide whether to continue the mwSuMD simulation. When the user decides to supervise two metrics, then a specific score is used. In the present work, distances between centroids, RMSDs, or the number of atomic contacts between two selections were supervised (Table S1). The choice of the metrics is system and problem-dependent, as the RMSD is most useful when the final state is known, while the distance is required when the target state is unknown; details on the scores are given below. The decision to restart or continue mwSuMD after any short simulation is postponed until all the walkers of a batch are collected. The best short simulation is selected and extended by seeding the same number of walkers, with the same duration as the step before.
For each walker, the score for the supervision of a single metric (SMscore) is computed as the square root of the product between the metric value in the last frame (Xlast frame) and the average metric value over the short simulation (X̅):
If the metric is set to decrease (e.g. binding or dimerization) the walker with the lowest SMscore is continued, otherwise (e.g. unbinding or outwards opening of domains), the walker with the highest score is continued. Using the SMscore rather than the slope should give more weight to the final state of each short simulation, as it is the starting point for the successive batch of simulations. Considering the average of the metric should favor short simulations consistently evolving in the desired direction along the metric.
If both X’ and X’’ are set to increase during the mwSuMD simulations, the score for the supervision of two metrics (DMscore) on each walker is computed as follows:
Where X’last frame and X’’last frame are the metrics values in the last frame, while X̅ ’batch walkers and X̅ ’’batch walkers represent the average value of the two metrics over all the walkers in the batch. Subtracting the value 1 to the metric ratio ensures that if one of the two metrics from the last frame (X’last frame or X’’last frame) is equal to the average (X̅ ’’batch walkers or X̅ ’’batch walkers) then that metric addend is null and DMscore depends only on the remaining metric. If any of the two metrics is set to decrease, then the corresponding component in Equation 2 is multiplied by −1 to maintain a positive score. Considering the average value of the two metrics over all the walkers rather than only over the considered walker should be more representative of the system evolution along the defined metric. In other words, the information about the metric is taken from all the walkers to better describe the evolution of the system.
The DMScore is designed to preserve some degree of independence between the two metrics supervised. Indeed, if the variation of one of them slows down and gets close to zero, the other metric is still able to drive the system’s evolution. It should be noted that DMScore works at its best if the two metrics have similar variations over time, as in the case of distance and RMSD (both of which are distance-based). Notably, differently from SuMD, when a walker is extended by seeding a new batch of short simulations and the remaining walkers are stopped, the atomic velocities are not reassigned. This allows the simulations to be as short as a few picoseconds if desired, without introducing artifacts due to the thermostat latency to reach the target temperature (usually up to 10-20 ps when a simulation is restarted reassigning the velocities of the atoms).
The current implementation of mwSuMD is for python3 and exploits MDAnalysis85 and MDTRaj86 modules.
Force field, ligands parameters, and general systems preparation
The CHARMM3687, 88/CGenFF 3.0.189–91 force field combination was employed in this work. Initial ligand force field, topology and parameter files were obtained from the ParamChem webserver89. Restrained electrostatic potential (RESP)92 partial charges were assigned to all the non-peptidic small molecules but adrenaline and guanosine-5’-diphosphate (GDP) using Gaussian09 (HF/6-31G* level of theory) and AmberTools20.
Six systems were prepared for MD (Table S1). Hydrogen atoms were added using the pdb2pqr93 and propka94 software (considering a simulated pH of 7.0); the protonation of titratable side chains was checked by visual inspection. The resulting receptors were separately inserted in a 1-palmitoyl-2-oleyl-sn-glycerol-3-phosphocholine (POPC) bilayer (previously built by using the VMD Membrane Builder plugin 1.1, Membrane Plugin, Version 1.1. at: http://www.ks.uiuc.edu/Research/vmd/plugins/membrane/), through an insertion method95. Receptor orientation was obtained by superposing the coordinates on the corresponding structure retrieved from the OPM database96. Lipids overlapping the receptor transmembrane helical bundle were removed and TIP3P water molecules97 were added to the simulation box by means of the VMD Solvate plugin 1.5 (Solvate Plugin, Version 1.5. at <http://www.ks.uiuc.edu/Research/vmd/plugins/solvate/). Finally, overall charge neutrality was reached by adding Na+/Cl- counter ions up to the final concentration of 0.150 M), using the VMD Autoionize plugin 1.3 (Autoionize Plugin, Version 1.3. at <http://www.ks.uiuc.edu/Research/vmd/plugins/autoionize/).
System equilibration and general MD settings
The MD engine ACEMD378 was employed for both the equilibration and productive simulations. The equilibration was achieved in isothermal-isobaric conditions (NPT) using the Berendsen barostat98 (target pressure 1 atm) and the Langevin thermostat99 (target temperature 300 K) with low damping of 1 ps-1. For the equilibration (integration time step of 2 fs): first, clashes between protein and lipid atoms were reduced through 1500 conjugate-gradient minimization steps, then a positional constraint of 1 kcal mol-1 Å-2 on all heavy atoms was gradually released over different time windows: 2 ns for lipid phosphorus atoms, 60 ns for protein atoms other than alpha carbon atoms, 80 ns for alpha carbon atoms; a further 20 ns of equilibration was performed without any positional constraints.
Productive trajectories (Table S1) were computed with an integration time step of 4 fs in the canonical ensemble (NVT). The target temperature was set at 300 K, using a thermostat damping of 0.1 ps-1; the M-SHAKE algorithm100, 101 was employed to constrain the bond lengths involving hydrogen atoms. The cut-off distance for electrostatic interactions was set at 9 Å, with a switching function applied beyond 7.5 Å. Long-range Coulomb interactions were handled using the particle mesh Ewald summation method (PME)102 by setting the mesh spacing to 1.0 Å.
Vasopressin binding simulations
The vasopressin 2 receptor (V2R) in complex with vasopressin (AVP) and the Gs protein103 was retrieved from the Protein Data Bank104 (PDB ID 7DW9). The Gs was removed from the system and the missing residues on ECL2 (G185-G189) were modeled from scratch using Modeller 9.19105. AVP was placed away from V2R in the extracellular bulk and the resulting system was prepared for MD simulations and equilibrated as reported above.
During SuMD simulations, the distance between the centroids of AVP residues C1-Q4 and V2R residues Q96, Q174, Q291, and L312 (Cα atoms only) was supervised over time windows of 600 ps or 100 ps (Table S1). MwSuMD simulations considered the same distance, the RMSD of AVP residues C1-Q4 to the experimental bound complex or the combination of the two during time windows of 600 ps (3 walkers) or 100 ps (10 walkers) (Table S1). Slope, SMscore, or DMscore (see Methods section MwSuMD protocol) was used in the different mwSuMD replicas performed (Table S1). Simulations were stopped after 300 ns (time window duration = 600 ps) or 50 ns (time window duration = 100 ps) of total SuMD or mwSuMD simulation time.
Vasopressin unbinding simulations
The V2R:AVP complex was prepared for MD simulations and equilibrated as reported above. During both SuMD and mwSuMD simulations (Table S1), the distance between the centroids of AVP residues C1-Q4 and V2R residues Q96, Q174, Q291, and L312 (Cα atoms only) was supervised over time windows of 100 ps (10 walkers seeded for mwSuMD simulations). Replicas were stopped when the AVP-V2R distance reached 40 Å.
Gs protein:β2 AR binding simulations
The model of the adrenergic β2 receptor (β2 AR) in an intermediate active state was downloaded from GPCRdb (https://gpcrdb.org/). The full agonist adrenaline (ALE) was inserted in the orthosteric site by superposition with the PDB ID 4LDO (fully active β2 AR)106. The structure of the inactive, GDP-bound Gs protein107 was retrieved from the Protein Data Bank104 (PDB ID 6EG8) and placed in the intracellular bulk. The resulting system (Gs > 50 Å away from (β2 AR) was prepared for MD simulations and equilibrated as reported above. The PDB ID 3SN6 (fully active β2 AR in complex with Gs41) was used as the reference for RMSD computations. Three mwSuMD replicas (Table S1) were performed supervising at the same time the distance between the helix 5 (α5) Gαs residues R385-L395 and the β2 AR residues V31-P330 as well as the RMSD of β2 AR TM6 residues C265-I278 (Cα atoms only) to the fully active state, during 100 ps time windows (5 walkers).
Membrane-anchored Gi protein:A1R simulations
Since the full-length structure of the inactive human Gi protein has not been yet resolved by X-ray or cryo-EM, it was modeled by superimposing the AlphaFold2108 AI models of the Gαi (P63096-F1), Gβ (Q9HAV0-F1), and Gγ (P50151-F1) subunits to the PDB file 6EG8 (a Gs heterotrimer). The resulting homotrimer (without GDP) was processed through Charmm-GUI109 to palmitoylate residue C3Gαi and geranylgeranylate residue C65Gγ 44, 110. The side chains of these two lipidated residues were manually inserted into a 120 x 120 Å POPC membrane and previously built by using the VMD Membrane Builder plugin 1.1, Membrane Plugin, Version 1.1. at: http://www.ks.uiuc.edu/Research/vmd/plugins/membrane/. Lipids overlapping the palmitoyl and geranylgeranyl groups were removed and TIP3P water molecules97 were added to the simulation box by means of the VMD Solvate plugin 1.5 (Solvate Plugin, Version 1.5. at <http://www.ks.uiuc.edu/Research/vmd/plugins/solvate/). Finally, overall charge neutrality was reached by adding Na+/Cl- counter ions up to the final concentration of 0.150 M), using the VMD Autoionize plugin 1.3 (Autoionize Plugin, Version 1.3. at <http://www.ks.uiuc.edu/Research/vmd/plugins/autoionize/). The first stage of equilibration was performed as reported above (Methods section System equilibration and general MD settings) for 120 ns, followed by a second stage in the NVT ensemble for a further 1 μs without any restraints to allow the membrane-anchored heterotrimeric Gi protein to stabilize within the intracellular side of the simulation box. After this two-stage, long equilibration, the active state A1R in complex with adenosine (PDB 6D9H) was manually inserted into the equilibrated membrane above the Gi protein using the corresponding structure retrieved from the OPM database as a reference, and the system further equilibrated for 120 ns as reported above (Methods section System equilibration and general MD settings). The A1R-Gi system was then subjected to both a 1 μs-long classic MD simulation and a mwSuMD simulation (Table S1). During the mwSuMD simulation, the RMSD of helix 5 (α5) Gαs residues 329-354 to the PDB 6D9H was supervised, seeding three walkers of 100 ps each until the productive simulation time reached 50 ns (total simulation time 150 ns).
GLP-1R:PF06882961 binding simulations
The inactive, ligand-free glucagon-like peptide receptor (GLP-1R) was retrieved from the Protein Data Bank104 (PDB ID 6LN2)111. Missing residues in the stalk and ICL2 were modeled with Modeller 9.19. The PF06882961 initial conformation was extracted from the complex with the fully active GLP-1R54 (PDB ID 7LCJ) and placed away from GLP-1R in the extracellular bulk. The resulting system was prepared for MD simulations and equilibrated as reported above. CGenFF dihedral force field parameters of PF06882961 with the highest penalties (dihedrals NG2R51-CG321-CG3C41-CG3C41 (penalty=143.5) and NG2R51-CG321-CG3C41-OG3C51 (penalty=152.4)) were optimized (Figure S13) employing Gaussian09 (geometric optimization and dihedral scan at HF/6-31g(d) level of theory) and the VMD force field toolkit plugin112.
Four classic MD replicas, for a total of 8 μs, were performed on the inactive, ligand-free receptor (prepared for MD simulations and equilibrated as reported above) to assess the possible binding path to the receptor TMD and therefore decide the initial position of PF06882961 in the extracellular bulk of the simulation box. A visual inspection of the trajectories suggested three major conformational changes that could allow ligand access to the TMD (Figure S13). Transitory openings of the ECD (distance Q47ECD - S310ECL2), TM6- TM7 (distance H3636.52 - F3907.45), and TM1-ECL1 (distance E1381.33 and W214ECL1) were observed. Since the opening of TM1-ECL1 was observed in two replicas out of four, we placed the ligand in a favorable position for crossing that region of GLP-1R.
MwSuMD simulations (Table S1) were performed stepwise to dock the ligand within GLP-1R first and then relax the receptor towards the active state. The PF06882961 binding was obtained by supervising at the same time the distance between the ligand and GLP-1R TM7 residues L3797.34-F3817.36, which are part of the orthosteric site (Cα atoms only), and the RMSD of the ECD (residues W33ECD-W120ECD, Cα atoms only) to the active state (PDB ID 7LCJ) until the former distance reached 4 Å. In the second phase of mwSuMD, the RMSD of the ECD (residues W33ECD-W120ECD, Cα atoms only) and the ECL1 to the active state (PDB ID 7LCJ) Cα atoms of residues M2042.74-L2243.27) were supervised until the latter reached less than 4 Å. During the third phase, the RMSD of PF06882961, as well as the RMSD of ECL3 (residues A3686.57-T3787.33, Cα atoms), were supervised until the former reached values lower than 3 Å. In the last mwSuMD step, only the RMSD of TM6 (residues I3456.34-F3676.56, Cα atoms) to the active state (PDB ID 7LCJ) was supervised until less than 5 Å.
Membrane-anchored Gs protein:GLP-1R simulations and GDP dissociation
The PDB 6EG8 was processed through Charmm-GUI109 to palmitoylate residue C3Gαi. The Gβ and Gγ subunits prepared as reported above were superimposed to the PDB file 6EG8 and the original Gβ and Gγ were removed. This step produced a heterotrimeric Gs protein palmitoylated at residue C3Gαs and geranylgeranylated at residue C65Gγ. The resulting system was inserted into a 120 x 120 Å POPC membrane and previously built by using the VMD Membrane Builder plugin 1.1, Membrane Plugin, Version 1.1. at: http://www.ks.uiuc.edu/Research/vmd/plugins/membrane. Lipids overlapping the palmitoyl and geranylgeranyl groups were removed and TIP3P water molecules97 were added to the simulation box by means of the VMD Solvate plugin 1.5 (Solvate Plugin, Version 1.5. at <http://www.ks.uiuc.edu/Research/vmd/plugins/solvate/). Finally, overall charge neutrality was reached by adding Na+/Cl- counter ions up to the final concentration of 0.150 M), using the VMD Autoionize plugin 1.3 (Autoionize Plugin, Version 1.3. at <http://www.ks.uiuc.edu/Research/vmd/plugins/autoionize/). The first stage of equilibration was performed as reported above (Methods section System equilibration and general MD settings) for 120 ns, followed by a second stage in the NVT ensemble for a further 1 μs without any restraints to allow the membrane-anchored heterotrimeric Gs protein to stabilize within the intracellular side of the simulation box. After this two-stage long equilibration, GLP-1R from the final frame of the activation simulation (in complex with PF06882961) was manually inserted into the equilibrated membrane above the Gs protein using the corresponding structure retrieved from the OPM database as a reference, and the system further equilibrated for 120 ns as reported above (Methods section System equilibration and general MD settings). The GLP-1R-Gs system was then subjected to three simulations (Table S1). Each mwSuMD replica was interrupted by 500 ns of classic MD twice, to relax the system during the transition. In the supervised stages, the distance between residues M386-L394 Gαs of helix 5 (α5) and the GLP-1R intracellular residues R1762.46, R3486.37, S3526.41, and N4057.60 was monitored, seeding three walkers of 200 ps each.
The AHD opening was simulated starting from the best GLP-1R:Gs binding mwSuMD replica (Replica 2 in Figure 4) by supervising the distance between AHD residues G70-R199 Gαs and K300-L394Gαs during three walkers of 100 ps each. 300 ns of classic MD was performed to relax the system. Finally, the GDP unbinding was supervised as the distance between GDP and residues E50, K52, T55, K293, and V367 of Gαs; five walkers were used in a 50 ps long mwSuMD simulation.
A2A:D2R heterodimerization
The inactive state A2AR and D2R were retrieved from the Protein Data Bank104 (PDB 5NM4 and 6LUQ, respectively)113, 114. Antagonists bound to the orthosteric site were removed and no modeling of the missing IC loops was attempted. A2AR and D2R were manually placed roughly 40 Å away from each other, on the plane of the membrane, orienting the two receptors to favor the dimerization through the interface formed by TM4 and TM5, as suggested by Borroto-Esquela D. O. et al.62 The resulting system was prepared for MD simulations and equilibrated as reported above.
The heterodimerization between A2AR and D2R was simulated with mwSuMD, seeding batches of three walkers with a duration of 100 ps each (Table S1). During each walker, the distance between TM5 of A2AR and D2R was supervised. At the same time, the distance between the centroids of A2AR and D2R was used as a collective variable for adiabatic MD115 (abMD) and well-tempered metadynamics116, 117 (wtMetaD) performed with Plumed 2.6118. For abMD, a distance target of 30 Å and a force constant of 10000 kJ*mol-1*Å-1) was used, while wtMetaD was performed by seeding Gaussian functions every 1 ps (sigma=1 Å; height=0.837 kJ*mol-1; T=310K) with a bias factor of 30. When the A2AR - D2R distance reached values lower than 40 Å and the first contacts between proteins were formed, the abMD was stopped and wtMetaD continued with a harmonic energy wall at 30 Å to avoid artificial crushing between the receptors due to the added energy bias. When the distance between A2AR and D2R was stable at about 30 Å, the collective variable biased by wtMetaD was set as the number of atomic contacts between A2AR and D2R, until reaching 200 ns of simulation. Finally, to relax the system and challenge the stability of the heterodimer formed during the biased mwSuMD simulation, a 1.5 μs classic MD simulation was performed.
A2AR-D2R heterobitopic ligand binding simulations
The A2AR-D2R heterobivalent ligand compound 2633 was parameterized as reported above and placed in the bulk solvent of the A2AR:D2R complex from the classic MD. Four mwSuMD replicas were collected supervising at the same time the distance between the A2A antagonist pyrazole-triazole-pyrimidine scaffold and the centroid of A2AR residues F168, N253, and A277 (Cα atoms) as well as the distance between the D2 antagonist 4-fluorobenzyl scaffold and the centroids of the Cα of D2R residues C118, F198, and V115 (Cα atoms). Ten walkers of 100 ps were simulated for every mwSuMD batch of replicas.
MD Analysis
Interatomic distances were computed through MDAnalysis85; root mean square deviations (RMSD) were computed using VMD119 and MDAnalysis85.
Interatomic contacts and ligand-protein hydrogen bonds were detected using the GetContacts scripts tool (https://getcontacts.github.io), setting a hydrogen bond donor-acceptor distance of 3.3 Å and an angle value of 120° as geometrical cut-offs. Contacts and hydrogen bond persistency are quantified as the percentage of frames (over all the frames obtained by merging the different replicas) in which protein residues formed contacts or hydrogen bonds with the ligand.
The MMPBSA.py120 script, from the AmberTools20 suite (The Amber Molecular Dynamics Package, at http://ambermd.org/), was used to compute molecular mechanics energies combined with the generalized Born and surface area continuum solvation (MM/GBSA) method or the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) approach, after transforming the CHARMM psf topology files to an Amber prmtop format using ParmEd (documentation at <http://parmed.github.io/ParmEd/html/index.html). Supplementary Videos were produced employing VMD and avconv (at https://libav.org/avconv.html). Molecular graphics images were produced using the UCSF Chimera121 (v1.14).
Numbering system
Throughout the manuscript, the Ballesteros-Weinstein residues numbering system for class A122 and the Wootten residues numbering system for class B GPCRs123 are adopted.
Associated content
Supporting Information (PDF)
Supporting Videos S1-S8 (mpg).
All the MD trajectories (stripped of POPC, water molecules, and ions) and topology files (psf and pdb) are available here:
https://zenodo.org/record/7944479
The mwSuMD software version used in this study is available at:
Acknowledgements
GD is a member of the GPCRs-focused European COST action ERNEST. CAR is grateful for a Royal Society Industry Fellowship. GD and CAR are grateful for support from the BBSRC (BB/W016974/1) and from Diabetes UK (BDA 20/0006307).
Competing Interest
All authors declare that they have no conflicts of interest.
References
- (1)Deciphering the Complexity of Ligand-Protein Recognition Pathways Using Supervised Molecular Dynamics (SuMD) SimulationsJ. Chem. Inf. Model 56:687–705
- (2)A Supervised Molecular Dynamics Approach to Unbiased Ligand-Protein UnbindingJ. Chem. Inf. Model 60:1804–1817
- (3)The Evolution of the GPCR Signaling System in Eukaryotes: Modularity, Conservation, and the Transition to Metazoan MulticellularityGenome Biol. Evol 6:606–619
- (4)G Protein-Coupled Receptors as Targets for Approved Drugs: How Many Targets and How Many Drugs?Mol. Pharmacol 93:251–258
- (5)The GRAFS Classification System of G-Protein Coupled Receptors in Comparative PerspectiveGen. Comp. Endocrinol 142:94–101
- (6)G protein-coupled receptors | G protein-coupled receptors | IUPHAR/BPS Guide to PHARMACOLOGY https://www.guidetopharmacology.org/GRAC/FamilyDisplayForward?familyId=694 (accessed May 18, 2022).
- (7)Regulation, Signaling, and Physiological Functions of G-ProteinsJ. Mol. Biol 428:3850–3868
- (8)Biased Ligands of G Protein-Coupled Receptors (GPCRs): Structure-Functional Selectivity Relationships (SFSRs) and Therapeutic PotentialJ. Med. Chem 61:9841–9878
- (9)Selectivity Determinants of GPCR-G-Protein BindingNature 545:317–322
- (10)Molecular Dynamics Simulation for AllNeuron 99:1129–1143
- (11)Molecular Dynamics Simulations and Drug DiscoveryBMC Biol 9
- (12)Using Metadynamics to Explore Complex Free-Energy LandscapesNat. Rev. Phys
- (13)Accelerated Molecular Dynamics: A Promising and Efficient Simulation Method for BiomoleculesJ. Chem. Phys 120:11919–11929
- (14)Gaussian Accelerated Molecular Dynamics: Unconstrained Enhanced Sampling and Free Energy CalculationJ. Chem. Theory Comput 11:3584–3595
- (15)A Combined Activation Mechanism for the Glucagon ReceptorProc. Natl. Acad. Sci. USA
- (16)Mechanism of the G-Protein Mimetic Nanobody Binding to a Muscarinic G-Protein-Coupled ReceptorProc. Natl. Acad. Sci. USA 115:3036–3041
- (17)Weighted Ensemble Simulation: Review of Methodology, Applications, and SoftwareAnnu. Rev. Biophys 46:43–57
- (18)Supervised Molecular Dynamics (SuMD) as a Helpful Tool to Depict GPCR-Ligand Recognition Pathway in a Nanosecond Time ScaleJ. Chem. Inf. Model 54:372–376
- (19)Exploring Ligand Binding to Calcitonin Gene-Related Peptide ReceptorsFront. Mol. Biosci 8
- (20)Structure and Dynamics of the Active Gs-Coupled Human Secretin ReceptorNat. Commun 11
- (21)Supervised Molecular Dynamics for Exploring the Druggability of the SARS-CoV-2 Spike ProteinJ. Comput. Aided Mol. Des 35:195–207
- (22)Structural and Functional Diversity among Agonist-Bound States of the GLP-1 ReceptorNat. Chem. Biol
- (23)Selective Activation of Gαob by an Adenosine A1 Receptor Agonist Elicits Analgesia without Cardiorespiratory DepressionNat. Commun 13
- (24)Kinetic Model of GPCR-G Protein Interactions Reveals Allokairic Modulation of SignalingNat. Commun 13
- (25)Kinetic Aspects of the Interaction between Ligand and G Protein-Coupled Receptor: The Case of the Adenosine ReceptorsChem. Rev 117:38–66
- (26)The Role of Target Binding Kinetics in Drug DiscoveryChemMedChem 10:1793–1796
- (27)Adenosine A2A-Dopamine D2 Receptor-Receptor Heteromerization: Qualitative and Quantitative Assessment by Fluorescence and Bioluminescence Energy TransferJ. Biol. Chem 278:46741–46749
- (28)Dopamine Heteroreceptor Complexes as Therapeutic Targets in Parkinson’s DiseaseExpert Opin. Ther. Targets 19:377–398
- (29)Using Caffeine and Other Adenosine Receptor Antagonists and Agonists as Therapeutic Tools against Neurodegenerative Diseases: A ReviewLife Sci 101:1–9
- (30)Adenosine A2A Receptor Antagonists as New Agents for the Treatment of Parkinson’s DiseaseTrends Pharmacol. Sci 18:338–344
- (31)Heteroreceptor Complexes and Their Allosteric Receptor-Receptor Interactions as a Novel Biological Principle for Integration of Communication in the CNS: Targets for Drug DevelopmentNeuropsychopharmacology 41:380–382
- (32)Adenosine A2A Receptor-Antagonist/Dopamine D2 Receptor-Agonist Bivalent Ligands as Pharmacological Tools to Detect A2A-D2 Receptor HeteromersJ. Med. Chem 52:5590–5602
- (33)Heterobivalent Ligand for the Adenosine A2A-Dopamine D2 Receptor HeteromerJ. Med. Chem
- (34)Vasopressin ReceptorsTrends Endocrinol. Metab 11:406–410
- (35)Vasopressin and Disorders of Water Balance: The Physiology and Pathophysiology of VasopressinAnn Clin Biochem 44:417–431
- (36)Dynamic Docking: A Paradigm Shift in Computational Drug DiscoveryMolecules 22
- (37)GPCRdb: An Information System for G Protein-Coupled ReceptorsNucleic Acids Res 44:D356–64
- (38)Addressing Free Fatty Acid Receptor 1 (FFAR1) Activation Using Supervised Molecular DynamicsJ. Comput. Aided Mol. Des 34:1181–1193
- (39)Deciphering the Agonist Binding Mechanism to the Adenosine A1 ReceptorACS Pharmacol. Transl. Sci 4:314–326
- (40)A Multisite Model of Allosterism for the Adenosine A1 ReceptorBioRxiv
- (41)Crystal Structure of the Β2 Adrenergic Receptor-Gs Protein ComplexNature 477:549–555
- (42)Structure and Dynamics of Adrenomedullin Receptors AM1 and AM2 Reveal Key Mechanisms in the Control of Receptor Phenotype by Receptor Activity-Modifying ProteinsACS Pharmacol. Transl. Sci 3:263–284
- (43)Rules and Mechanisms Governing G Protein Coupling Selectivity of GPCRsCell Rep 42
- (44)Lipid Modifications of G Proteins: Alpha Subunits Are PalmitoylatedProc. Natl. Acad. Sci. USA 90:3675–3679
- (45)How a G Protein Binds a MembraneJ. Biol. Chem 279:33937–33945
- (46)The Full Activation Mechanism of the Adenosine A1 Receptor Revealed by GaMD and Su-GaMD SimulationsProc. Natl. Acad. Sci. USA 119
- (47)Activation of the GLP-1 Receptor by a Non-Peptidic AgonistNature 577:432–436
- (48)Structural Basis for GLP-1 Receptor Activation by LY3502970, an Orally Active Nonpeptide AgonistProc. Natl. Acad. Sci. USA 117:29959–29967
- (49)Structural Insights into the Activation of GLP-1R by a Small Molecule AgonistCell Res 30:1140–1142
- (50)Differential GLP-1R Binding and Activation by Peptide and Non-Peptide AgonistsMol. Cell 80:485–500
- (51)Molecular Insights into Ago-Allosteric Modulation of the Human Glucagon-like Peptide-1 ReceptorNat. Commun 12
- (52)Structural Basis of Peptidomimetic Agonism Revealed by Small Molecule GLP-1R Agonists Boc5 and WB4-24BioRxiv
- (53)Structural Perspective of Class B1 GPCR SignalingTrends Pharmacol. Sci 43:321–334
- (54)Evolving Cryo-EM Structural Approaches for GPCR Drug DiscoveryStructure 29:963–974
- (55)Universal Activation Index for Class A GpcrsJ. Chem. Inf. Model 59:3938–3945
- (56)Dynamics of GLP-1R Peptide Agonist Engagement Are Correlated with Kinetics of G Protein ActivationNat. Commun 13
- (57)Single-Molecule Analysis of Ligand Efficacy in Β2AR-G-Protein ActivationNature 547:68–73
- (58)Structural Basis for Nucleotide Exchange in Heterotrimeric G ProteinsScience 348:1361–1365
- (59)Simulation of Spontaneous G Protein Activation Reveals a New Intermediate Driving GDP UnbindingElife 7
- (60)A Conserved Hydrophobic Core in Gαi1 Regulates G Protein Activation and Release from Activated ReceptorJ. Biol. Chem 291:19674–19686
- (61)Universal Allosteric Mechanism for Gα Activation by GPCRsNature 524:173–179
- (62)Mapping the Interface of a GPCR Dimer: A Structural Model of the A2A Adenosine and D2 Dopamine Receptor HeteromerFront. Pharmacol 9
- (63)Lateral Diffusion of Membrane ProteinsJ. Am. Chem. Soc 131:12650–12656
- (64)WExplore: Hierarchical Exploration of High-Dimensional Spaces Using the Weighted Ensemble AlgorithmJ. Phys. Chem. B 118:3532–3542
- (65)Replica-Exchange Molecular Dynamics Method for Protein FoldingChem. Phys. Lett 314:141–151
- (66)End-Point Binding Free Energy Calculation with MM/PBSA and MM/GBSA: Strategies and Applications in Drug DesignChem. Rev 119:9478–9508
- (67)Membrane Interactions of G Proteins and Other Related ProteinsBiochim. Biophys. Acta 1778:1640–1652
- (68)The Pathway of Ligand Entry from the Membrane Bilayer to a Lipid G Protein-Coupled ReceptorSci. Rep 6
- (69)Entry from the Lipid Bilayer: A Possible Pathway for Inhibition of a Peptide G Protein-Coupled Receptor by a Lipophilic Small MoleculeBiochemistry
- (70)Conformational Dynamics of the Activated GLP-1 Receptor-Gs Complex Revealed by Cross-Linking Mass Spectrometry and Integrative Structure ModelingACS Cent. Sci
- (71)Lipid-Dependent GPCR DimerizationMethods Cell Biol 117:341–357
- (72)Cholesterol-Dependent Conformational Plasticity in GPCR DimersSci. Rep 6
- (73)Computational Approaches for Modeling GPCR DimerizationCurr. Pharm. Biotechnol 15:996–1006
- (74)The Adaptive Path Collective Variable: A Versatile Biasing Approach to Compute the Average Transition Path and Free Energy of Molecular TransitionsMethods Mol. Biol :255–290
- (75)Markov State Models: From an Art to a ScienceJ. Am. Chem. Soc 140:2386–2396
- (76)Machine Learning Analysis of ΤRAMD Trajectories to Decipher Molecular Determinants of Drug-Target Residence TimesFront. Mol. Biosci 6
- (77)Simulations Meet Machine Learning in Structural BiologyCurr. Opin. Struct. Biol 49:139–144
- (78)ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time ScaleJ. Chem. Theory Comput 5:1632–1639
- (79)Scalable Molecular Dynamics with NAMDJ. Comput. Chem 26:1781–1802
- (80)GROMACS: A Message-Passing Parallel Molecular Dynamics ImplementationComput Phys Commun 91:43–56
- (81)OpenMM 7: Rapid Development of High Performance Algorithms for Molecular DynamicsPLoS Comput. Biol 13
- (82)Estimation of Kinetic and Thermodynamic Ligand-Binding Parameters Using Computational StrategiesFuture Med. Chem 9:507–523
- (83)Exploring Protein-Peptide Recognition Pathways Using a Supervised Molecular Dynamics ApproachStructure 25:655–662
- (84)Molecular Signature for Receptor Engagement in the Metabolic Peptide Hormone AmylinACS Pharmacol. Transl. Sci 1:32–49
- (85)MDAnalysis: A Toolkit for the Analysis of Molecular Dynamics SimulationsJ. Comput. Chem 32:2319–2327
- (86)Mdtraj: A Modern Open Library for the Analysis of Molecular Dynamics TrajectoriesBiophys. J 109:1528–1532
- (87)CHARMM36 All-Atom Additive Protein Force Field: Validation Based on Comparison to NMR DataJ. Comput. Chem 34:2135–2145
- (88)CHARMM36m: An Improved Force Field for Folded and Intrinsically Disordered ProteinsNat. Methods 14:71–73
- (89)Automation of the CHARMM General Force Field (CGenFF) I: Bond Perception and Atom TypingJ. Chem. Inf. Model 52:3144–3154
- (90)Automation of the CHARMM General Force Field (CGenFF) II: Assignment of Bonded Parameters and Partial Atomic ChargesJ. Chem. Inf. Model 52:3155–3168
- (91)Extension of the CHARMM General Force Field to Sulfonyl-Containing Compounds and Its Utility in Biomolecular SimulationsJ. Comput. Chem 33:2451–2468
- (92)Restrained Electrostatic Potential Atomic Partial Charges for Condensed-Phase Simulations of CarbohydratesTheochem 527:149–156
- (93)PDB2PQR: An Automated Pipeline for the Setup of Poisson-Boltzmann Electrostatics CalculationsNucleic Acids Res 32:W665–7
- (94)PROPKA3: Consistent Treatment of Internal and Surface Residues in Empirical PK PredictionsJ. Chem. Theory Comput 7:525–537
- (95)Membrane Packing Problems: A Short Review on Computational Membrane Modeling Methods and ToolsComput Struct Biotechnol J 5
- (96)OPM: Orientations of Proteins in Membranes DatabaseBioinformatics 22:623–625
- (97)Comparison of Simple Potential Functions for Simulating Liquid WaterJ. Chem. Phys 79
- (98)Molecular Dynamics with Coupling to an External BathJ. Chem. Phys 81
- (99)Langevin Dynamics of Peptides: The Frictional Dependence of Isomerization Rates of N-Acetylalanyl-N’-MethylamideBiopolymers 32:523–535
- (100)SHAKE, Rattle, and Roll: Efficient Constraint Algorithms for Linked Rigid BodiesJ. Comput. Chem
- (101)A Fast SHAKE Algorithm to Solve Distance Constraint Equations for Small Molecules in Molecular Dynamics SimulationsJ. Comput. Chem 22:501–508
- (102)A Smooth Particle Mesh Ewald MethodJ. Chem. Phys 103
- (103)Molecular Basis of Ligand Recognition and Activation of Human V2 Vasopressin ReceptorCell Res 31:929–931
- (104)The Protein Data BankNucleic Acids Res 28:235–242
- (105)Modeller: Generation and Refinement of Homology-Based Protein Structure ModelsMeth. Enzymol 374:461–491
- (106)Adrenaline-Activated Structure of Β2-Adrenoceptor Stabilized by an Engineered NanobodyNature 502:575–579
- (107)Structural Insights into the Process of GPCR-G Protein Complex FormationCell 177:1243–1251
- (108)Highly Accurate Protein Structure Prediction with AlphaFoldNature 596:583–589
- (109)CHARMM-GUI: A Web-Based Graphical User Interface for CHARMMJ. Comput. Chem 29:1859–1865
- (110)Gγ and Gα Identity Dictate a G-Protein Heterotrimer Plasma Membrane TargetingCells 8
- (111)Full-Length Human GLP-1 Receptor Structure without Orthosteric LigandsNat. Commun 11
- (112)Rapid Parameterization of Small Molecules Using the Force Field ToolkitJ. Comput. Chem 34:2757–2770
- (113)Serial Millisecond Crystallography for Routine Room-Temperature Structure Determination at SynchrotronsNat. Commun 8
- (114)Haloperidol Bound D2 Dopamine Receptor Structure Inspired the Discovery of Subtype Selective LigandsNat. Commun 11
- (115)Adiabatic Bias Molecular Dynamics: A Method to Navigate the Conformational Space of Complex Molecular SystemsJ. Chem. Phys 110
- (116)Escaping Free-Energy MinimaProc. Natl. Acad. Sci. USA 99:12562–12566
- (117)Well-Tempered Metadynamics: A Smoothly Converging and Tunable Free-Energy MethodPhys. Rev. Lett 100
- (118)PLUMED 2: New Feathers for an Old BirdComput Phys Commun 185:604–613
- (119)VMD: Visual Molecular DynamicsJ Mol Graph 14:33–38
- (120)MMPBSA.Py: An Efficient Program for End-State Free Energy CalculationsJ. Chem. Theory Comput 8:3314–3321
- (121)UCSF Chimera—a Visualization System for Exploratory Research and AnalysisJ. Comput. Chem 25:1605–1612
- (122)[19] Integrated Methods for the Construction of Three-Dimensional Models and Computational Probing of Structure-Function Relations in G Protein-Coupled ReceptorsReceptor Molecular Biology Elsevier :366–428
- (123)Polar Transmembrane Interactions Drive Formation of Ligand-Specific and Signal Pathway-Biased Family B G Protein-Coupled Receptor ConformationsProc. Natl. Acad. Sci. USA 110:5211–5216
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
Copyright
© 2024, Deganutti et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.