Identification of ligand-specific G protein-coupled receptor states and prediction of downstream efficacy via data-driven modeling

  1. Oliver Fleetwood
  2. Jens Carlsson
  3. Lucie Delemotte  Is a corresponding author
  1. Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, Sweden
  2. Science for Life Laboratory, Department of Cell and Molecular Biology, Uppsala University, Sweden
6 figures, 2 tables and 1 additional file

Figures

Structure and microswitches of the β2 adrenergic receptor.

(a) A molecular dynamics (MD) snapshot of the β2 adrenergic receptor in complex with adrenaline in an active-like state (simulation starting from PDB 3P0G). The vignettes show the conformations of residue pairs reflecting important microswitches in the active and inactive structures 3P0G (color) and 2RH1 (white): the TM5 bulge (red), measured as the closest heavy atom distance between S207(5.46) and G315(7.41); the connector region’s conformational change (pink), measured as the difference in root-mean-square deviation (RMSD) between the active and inactive state structure of the residues I121(3.40) and F282(6.44); the Y-Y motif (black), measured as the C-ζ distance between Y219(5.58) and Y326(7.53) of the NPxxY motif; and the Ionic lock displacement (orange), measured as the closest heavy atom distance between E268(6.30) and R131(3.50). (b) Ligands examined in this study: agonists BI-167107 and adrenaline; biased partial agonist salmeterol; antagonists timolol and alprenolol; and the inverse agonist carazolol. Atoms are colored according to their partial charge.

Figure 2 with 11 supplements
Ligand-dependent free energy landscapes and expected downstream response.

(a) Free energy landscapes for the different ligands projected along the connector ΔRMSD microswitch. The inactive and active states are marked by R and R* respectively. (b) Free energy projected onto the TM5 bulge CV in the orthosteric site and the ionic lock distance (measuring TM6 displacement in the G protein-binding site). (c)–(f) Correlation between experimental values of downstream cyclic adenosine monophosphate (cAMP) signaling and the expectation value of different microswitches for the receptor bound to different ligands, (c) for the TM5 bulge, (d) for the connector ΔRMSD, (e) for the Y-Y motif, and (f) for the ionic lock distance. The cAMP Emax values for carazolol and BI-167107 (marked with an asterisk), which were not available in the experimental study, are inferred from the linear regression. Red dashed lines highlight the clustering of the agonist-bound structures.

Figure 2—figure supplement 1
Strings averaged over different iterations for the carazolol-bound receptor initiated from the active starting structure.

The four most important collective variables (CVs) are shown on the y-axis. The x-axis shows progression toward the inactive endpoint. Whiskers show the interquartile range for the last iterations; outliers are shown as circles. Note that although the other 39 CVs are not shown, they were of lower importance and their contribution to the drift of the string was smaller compared to those shown.

Figure 2—figure supplement 2
Strings averaged over different iterations for the carazolol-bound receptor initiated from the inactive starting structure.

The four most important collective variables are shown on the y-axis. The x-axis shows progression toward the inactive endpoint. Whiskers show the interquartile range for the last iterations; outliers are shown as circles.

Figure 2—figure supplement 3
Strings averaged over different iterations for the alprenolol-bound receptor initiated from the inactive starting structure.

The four most important collective variables are shown on the y-axis. The x-axis shows progression toward the inactive endpoint. Whiskers show the interquartile range for the last iterations; outliers are shown as circles.

Figure 2—figure supplement 4
Strings averaged over different iterations for the timolol-bound receptor initiated from the inactive starting structure.

The four most important collective variables are shown on the y-axis. The x-axis shows progression toward the inactive endpoint. Whiskers show the interquartile range for the last iterations; outliers are shown as circles.

Figure 2—figure supplement 5
Strings averaged over different iterations for the salmeterol-bound receptor initiated from the inactive starting structure.

The four most important collective variables are shown on the y-axis. The x-axis shows progression toward the inactive endpoint. Whiskers show the interquartile range for the last iterations; outliers are shown as circles.

Figure 2—figure supplement 6
Strings averaged over different iterations for the adrenaline-bound receptor initiated from the inactive starting structure.

The four most important collective variables are shown on the y-axis. The x-axis shows progression toward the inactive endpoint. Whiskers show the interquartile range for the last iterations; outliers are shown as circles.

Figure 2—figure supplement 7
Free energy landscapes projected along important microswitches.

(a) The TM5 bulge microswitch in the orthosteric site, (b) the connector ΔRMSD, (c) the D79-N322 distance in the transmembrane region, (d) the NPxxY motif’s (Cɑ positions of residues 322–327) RMSD to the inactive structure 2RH1, (e) the Y-Y motif in the intracellular binding site, and (f) the ionic lock displacement. The box plots show the median (as a horizontal line in orange), the interquartile range (as a box), the upper and lower whiskers (as vertical lines), and the outliers (as circles), based on 1000 transition matrices from MCMC sampling. The inactive and active states are marked by R and R*, respectively. Vertical dashed lines show the boundary between the active and inactive states.

Figure 2—figure supplement 8
Correlation between experimental values of downstream cyclic adenosine monophosphate (cAMP) signaling and the relative free energy of states for the receptor bound to different ligands.

(a) For the TM5 bulge microswitch, (b) for the connector ΔRMSD, (c) for the D79-N322 distance, (d) for the Y-Y motif microswitch, and (e) for the ionic lock distance. The cAMP values for carazolol and BI-167107 (marked with an asterisk) which were not available in the literature are inferred from the linear regression.

Figure 2—figure supplement 9
Comparison between the two pathways obtained for the carazolol-bound receptor.

(a) The average final average pathways along the two highest ranked collective variables (CVs). (b)–(d) Free energy landscapes from the two pathways. Left column: results from the 2RH1 (inactive structure)-initiated string. Right column: results from the 3P0G (active structure)-initiated string. States present in one system but not the other are highlighted by black boxes. (b) The connector ΔRMSD rotameric switch in the transmembrane (TM) region. (c) Free energy projected onto the TM5 bulge CV in the orthosteric site and the ionic lock distance in the G protein-binding site. (d) Free energy projected onto the TM6 backbone displacement, measured as the Cɑ distance between residues L272(6.34) and R131(3.50), and the ionic lock distance.

Figure 2—figure supplement 10
Correlation between experimental values of downstream cyclic adenosine monophosphate (cAMP) signaling and microswitch expectation value for the receptor bound to different ligands.

(a) For the TM5 bulge microswitch, (b) for the connector ΔRMSD, (c) for the Y-Y motif microswitch, and (d) for the ionic lock distance. Carazolol’s values were derived from the string with the inactive state starting structure, 2RH1.

Figure 2—figure supplement 11
Free energy landscapes for the receptor in its apo state (left column), bound to carazolol (center column), and bound to BI-167107 (right column).

These three conditions were also investigated in spectroscopy experiments, which we compare to throughout the article. (a) Free energy projected onto the TM6 backbone displacement, measured the Cɑ distance between residues L272(6.34) and R131(3.50), and the ionic lock distance. Black boxes highlight states where the Ionic lock side chains assume agonist-specific orientations either toward (for carazolol) or away (for BI-167107) from each other. (b) Free energy projected onto the hydration of the G protein-binding site, measured as the number of water molecules within 0.8 nm of L266(6.34) and the ionic lock distance. Black boxes highlight the agonist-specific active state with a dehydrated binding site.

Figure 3 with 4 supplements
Dimensionality reduction techniques applied to the active-like simulation ensembles.

Each point represents a simulation snapshot, colored according to the ligand bound to the receptor. Red dashed lines highlight regions where agonists cluster. The features are computed as the inverse closest heavy atom distances between residues. (a) Principal component analysis (PCA) projection onto the first two principal components (PCs), (b) multi-dimensional scaling (MDS) and (c) t-distributed stochastic neighbor embedding (t-SNE). (d) The similarity between conformations sampled when agonist (BI-167107, adrenaline and salmeterol) and non-agonist ligands are bound, measured as the average distance between configurations in the full feature space.

Figure 3—figure supplement 1
Dimensionality reduction of the activation pathways, applied to the activation paths derived from the swarms of trajectories method.

Every point represents a simulation snapshot, colored according to its index along the converged string with a marker specific to the bound ligand. The darker the color, the closer to the active state endpoint the snapshot is. The features are computed as the inverse closest heavy atom distances between residues. (a) The first two principal components (PCs) of principal component analysis (PCA), (b) PCA’s third and fourth PC, (c) multi-dimensional scaling (MDS) and (d) t-distributed stochastic neighbor embedding (t-SNE).

Figure 3—figure supplement 2
Equilibration of single states.

Distance in CV space (y-axis) between the center points of two subsequent iterations (x-axis). Eventually the center points diffuse very slowly between iterations, which indicates convergence. The shaded areas show the distance between iterations taking the standard deviation of the walkers’ distance to the center points into account.

Figure 3—figure supplement 3
Dimensionality reduction plots and similarity metrics for the kinetically trapped active-like states, using three subsets of the simulation frames in (a)–(c).

Each dataset contains a third of the total simulation snapshots.

Figure 3—figure supplement 4
State projection onto higher-order PCs.

Projection of the single equilibrated states onto the third and fourth principal components (PCs). Compared to the projection onto the first two components (Figure 3a), there is much less overlap between the different agonists and timolol.

Figure 4 with 4 supplements
Important residues for distinguishing ligand-dependent activation mechanisms.

(a)–(b) Residues identified to be important for classification of the equilibrated active-like states. Importance was derived by computing the Kullback–Leibler (KL) divergence along all features, followed by averaging per residue. (a) Comparison of agonists and non-agonists. One signaling hotspot is located at the transmembrane 5 (TM5) bulge and another on TM7 close to the NPxxY motif. (b) Important residues to discriminate between all ligands. The main hotspot is located near the NPxxY motif. (c) Important residues for the activation ensemble from the swarms of trajectories method, extracted with PCA. The importance per feature was computed as the product of the PC’s weights and the PC’s projection onto the input feature. The intracellular end of TM6, which undergoes a large conformational change upon activation, is marked as important. Inverse closest heavy atom distances were used as input features in all figures. (d) Conceptual model describing allosteric communication between the hotspots. Ligands exercise direct control of TM5, which is stabilized in different states by agonists and non-agonists. In turn, residues approximately one helical turn below the orthosteric site, including the connector region, couple to the conformations in the orthosteric site. This leads to distinct interaction patterns between TM6 and TM7 in the TM domain. The importance of TM7 is further enhanced by direct ligand interactions. By favoring distinct TM7 states and modulating the probability of stabilizing TM6 in an active conformation, ligands control the G protein-binding site.

Figure 4—figure supplement 1
Feature importance projected onto snakeplots.

(a)–(b) Residues identified to be important for the classification of equilibrated states. Importance was derived by computing the Kullback–Leibler (KL) divergence along all features, then averaging per residue. (a) Comparison of agonists to non-agonists. (b) Important residues to discriminate between all ligands. (c) Important residues for the activation ensemble from the swarms of trajectories method, extracted with principal component analysis (PCA). Inverse closest heavy atom distances were used as features in all figures.

Figure 4—figure supplement 2
Comparison between supervised feature extraction methods.

Comparison between (a) computing the Kullback–Leibler (KL) divergence per feature for identifying important residues and (b) a random forest (RF) classifier’s feature importance in the single equilibrated states. Inverse closest heavy atom distances were used as features.

Figure 4—figure supplement 3
Important activation features per ligand identified by applying unsupervised principal component analyis (PCA) on the activation paths.
Figure 4—figure supplement 4
Important features for equilibrated active-like states.

Important residues for discriminating individual equilibrated active-like states from each other.

Figure 5 with 1 supplement
Molecular basis for agonists’ control of receptor activation.

(a) Molecular basis for agonists’ control of receptor activation. S203(5.43) and S207(5.46) (red sticks) are part of the transmembrane 5 (TM5) bulge, which forms direct contacts with the ligand. V206(5.46) forms van der Waals interactions with T118(3.37) (red), which is located above I121(3.40) of the PIF motif in the connector region (pink sticks). TM6 and TM7, highlighted as L284(6.46) (orange sticks) close to F282(6.44) of the PIF motif and F321(7.48) (orange sticks) above the NPxxY motif, move closer together in the presence of agonists. Half a helix turn above F321(7.48), S319(7.46) forms a backbone interaction with the side chain of N51(1.50) for agonist-bound receptors, whereas water molecules interact with these residues for non-agonists. Together with TM7-ligand contacts in the orthosteric site, these interaction pathways stabilize the second important hotspot on TM7 close to the NPxxY motif (Y326(7.53) shown in black). (b) The T118(3.37)-V206(5.46) distance near the orthosteric site against the L284(6.46)-F321(7.48) distance in the TM region. Agonists contract both of these regions. (c) The distance across the orthosteric site between S207(5.461) and V307(7.33) (dark red in [a]) against the S203(5.43)-E338(8.56) distance across the TM domain. Agonists stabilize more compact receptor conformations. (d) The N51(1.50)-S319(7.46) distance against the L275(6.37)-Y326(7.53) distance. Agonists share the common feature of stabilizing the N51(1.50)-S319(7.46) backbone interaction, but form different NPxxY orientations, shown as the distance from Y326(7.53) to L275(6.37). (e) The three agonists stabilize slightly different TM6 and TM7 orientations, here illustrated by the distance between L275(6.37) and Y326(7.53). Adrenaline (purple) induces an active-like NPxxY motif, whereas BI-167107 (dark blue) stabilizes an inactive-like motif. The salmeterol-bound receptor (slate gray) adopts a distinct Y326(7.53) orientation.

Figure 5—figure supplement 1
Three distinct NPxxY conformations stabilized by salmeterol.

Simulation snapshots of salmeterol assuming three distinct conformations of the NPxxY motif: an inactive conformation (white), an active-like conformation (orange), and a novel conformation (blue). (a) The receptor from the transmembrane view. (b) The interaction between salmeterol and S207(5.46), lost in the novel state. (c) The orientation of N322 (7.49) and Y326 of the NPxxY motif. (d) Scatter plot of snapshots from the 300th string iteration. The interaction between salmeterol and S207(5.46) and S203(5.43) is only lost when the NPxxY motif assumes a unique conformation (measured as the backbone RMSD for residues 322–327 to the inactive structure 2RH1).

Author response image 1

Tables

Table 1
Total simulation time per string of swarm simulation ensemble*.
LigandSteered molecular dynamics simulation time [µs]#Restrained equilibration trajectories (30 ps each)#Swarm trajectories (10 ps each)Total simulation time [µs]
Carazolol0.2148783528164.17
Alprenolol0.2148783648564.29
Timolol0.2148783638084.28
Salmeterol0.2148783632324.28
Adrenaline0.2148783729364.38
  1. * The previously published apo and BI-167107 initiated systems followed a slightly different initialization protocol with three substrings (Fleetwood et al., 2020b) and have been excluded from the table.

Table 2
String simulations: collective variables.
ResiduesImportance
F223(5.62)-A271(6.33)1.0
Q224(5.63)-K227(5.66)0.97
F223(5.62)-L272(6.34)0.76
I325(7.52)-R328(7.55)0.73
F223(5.62)-K227(5.66)0.72
A226(5.65)-K267(6.29)0.69
V54(1.53)-C327(7.54)0.67
L324(7.51)-R328(7.55)0.67
A134(3.53)-Y1410.66
I135(3.54)-L272(6.34)0.6
V222(5.61)-A271(6.33)0.59
A226(5.65)-E268(6.30)0.59
Q26(1.25)-D29(1.28)0.57
I135(3.54)-E225(5.64)0.57
R131(3.50)-L275(6.37)0.57
A134(3.53)-A271(6.33)0.56
C285(6.47)-V317(7.43)0.56
A76(2.47)-P323(7.50)0.56
Q26(1.25)-E30(1.29)0.55
A226(5.65)-A271(6.33)0.54
I121(3.40)-F208(5.47)0.53
E338(8.56)-R3430.53
I334(8.52)-R3440.52
G50(1.49)-L324(7.51)0.49
T25-D29(1.28)0.47
I135(3.54)-A271(6.33)0.46
T25-E30(1.29)0.46
W286(6.48)-G315(7.41)0.45
C285(6.47)-N318(7.45)0.44
Q27(1.26)-E30(1.29)0.44
R63-D331(8.49)0.43
Q197(5.37)-V297(6.59)0.43
A134(3.53)-E268(6.30)0.42
P288(6.50)-L311(7.37)0.42
T281(6.43)-N318(7.45)0.42
C341(8.59)-R3440.42
I135(3.54)-P1380.4
R328(7.55)-R333(8.51)0.36
L311(7.37)-G315(7.41)0.36
I135(3.54)-E268(6.30)0.35
C285(6.47)-I314(7.40)0.34

Additional files

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

  1. Oliver Fleetwood
  2. Jens Carlsson
  3. Lucie Delemotte
(2021)
Identification of ligand-specific G protein-coupled receptor states and prediction of downstream efficacy via data-driven modeling
eLife 10:e60715.
https://doi.org/10.7554/eLife.60715