1. Biochemistry and Chemical Biology
  2. Structural Biology and Molecular Biophysics
Download icon

Mapping transiently formed and sparsely populated conformations on a complex energy landscape

  1. Yong Wang
  2. Elena Papaleo
  3. Kresten Lindorff-Larsen  Is a corresponding author
  1. University of Copenhagen, Denmark
Research Article
  • Cited 24
  • Views 1,845
  • Annotations
Cite this article as: eLife 2016;5:e17505 doi: 10.7554/eLife.17505

Abstract

Determining the structures, kinetics, thermodynamics and mechanisms that underlie conformational exchange processes in proteins remains extremely difficult. Only in favourable cases is it possible to provide atomic-level descriptions of sparsely populated and transiently formed alternative conformations. Here we benchmark the ability of enhanced-sampling molecular dynamics simulations to determine the free energy landscape of the L99A cavity mutant of T4 lysozyme. We find that the simulations capture key properties previously measured by NMR relaxation dispersion methods including the structure of a minor conformation, the kinetics and thermodynamics of conformational exchange, and the effect of mutations. We discover a new tunnel that involves the transient exposure towards the solvent of an internal cavity, and show it to be relevant for ligand escape. Together, our results provide a comprehensive view of the structural landscape of a protein, and point forward to studies of conformational exchange in systems that are less characterized experimentally.

https://doi.org/10.7554/eLife.17505.001

eLife digest

Proteins are the workhorses of cells, where they perform a wide range of roles. To do so, they adopt specific three-dimensional structures that enable them to interact with other molecules as necessary. Often a protein needs to be able to shift between different states with distinct structures as it goes about its job. To fully understand how a protein works, it is important to be able to characterize these different structures and how the protein changes between them.

Many of the experimental techniques used to study protein structure rely on isolating the individual structural forms of a protein. Since many structures only exist briefly, this can be very difficult. To complement experimental results, computer simulations allow researchers to model how atoms behave within a molecule. However, a number of factors limit how well these models represent what happens experimentally, such as the accuracy of the physical description used for the modeling.

Wang et al. set out to test and benchmark how well computer simulations could model changes in structure for a protein called T4 lysozyme, which has been studied extensively using experimental techniques. T4 lysozyme exists in two different states that have distinct structures. By comparing existing detailed experimental measurements with the results of their simulations, Wang et al. found that the simulations could capture key aspects of how T4 lysozyme changes its shape.

The simulations described the structure of the protein in both states and accurately determined the relative proportion of molecules that are found in each state. They could also determine how long it takes for a molecule to change its shape from one state to the other. The findings allowed Wang et al. to describe in fine detail – down to the level of individual atoms – how the protein changes its shape and how mutations in the protein affect its ability to do so. A key question for future studies is whether these insights can be extended to other proteins that are less well characterized experimentally than T4 lysozyme.

https://doi.org/10.7554/eLife.17505.002

Introduction

Proteins are dynamical entities whose ability to change shape often plays essential roles in their function. From an experimental point of view, intra-basin dynamics is often described via conformational ensembles whereas larger scale (and often slower) motions are characterized as a conformational exchange between distinct conformational states. The latter are often simplified as a two-site exchange process, GE, between a highly populated ground (G) state, and a transiently populated minor (or ‘excited’, E) state. While the structure of the ground state may often be determined by conventional structural biology tools, it is very difficult to obtain atomic-level insight into minor conformations due to their transient nature and low populations. As these minor conformations may, however, be critical to protein functions, including protein folding, ligand binding, enzyme catalysis, and signal transduction (Mulder et al., 2001; Tang et al., 2007; Baldwin and Kay, 2009) it is important to be able to characterize them in detail. While it may in certain cases be possible to capture sparsely populated conformations in crystals under perturbed experimental conditions, or to examine their structures by analysis of electron density maps (Fraser et al., 2009), NMR spectroscopy provides unique opportunities to study the dynamical equilibrium between major and minor conformations (Baldwin and Kay, 2009; Sekhar and Kay, 2013) via e.g. chemical-exchange saturation transfer (Vallurupalli et al., 2012), Carr-Purcell-Meiboom-Gill (CPMG) relaxation dispersion (Hansen et al., 2008), or indirectly via paramagnetic relaxation enhancement (Tang et al., 2007) or residual dipolar coupling (Lukin et al., 2003) experiments. In favourable cases such experiments can provide not only thermodynamic and kinetic information (i.e. the population of G and E states and the rate of exchange between them), but also structural information in the form of chemical shifts (CS), that can be used to determine the structure of the transiently populated state (Sekhar and Kay, 2013).

Despite the important developments in NMR described above, it remains very difficult to obtain structural models of minor conformations, and a substantial amount of experiments are required. Further, it is generally not possible to use such experiments to infer the mechanisms of interconversion, and to provide a more global description of the multi-state free energy landscape (Zhuravlev and Papoian, 2010; Wang et al., 2012). In the language of energy landscape theory (Onuchic et al., 1997), free energy basins and their depths control the population and stability of functionally distinct states, while the relative positions of basins and the inter-basin barrier heights determine the kinetics and mechanism of conformational exchange. As a complement to experiments, such functional landscapes can be explored by in silico techniques, such as molecular dynamics (MD) simulations, that may both be used to help interpret experimental data and provide new hypotheses for testing (Karplus and Lavery, 2014; Eaton and Muñoz, 2014). Nevertheless, the general applicability of simulation methods may be limited by both the accuracy of the physical models (i.e. force fields) used to describe the free energy landscape and our ability to sample these efficiently by computation. We therefore set out to benchmark the ability of simulations to determine conformational free energy landscapes.

The L99A variant of lysozyme from the T4 bacteriophage (T4L) has proven an excellent model system to understand protein structure and dynamics. Originally designed a ‘cavity creating’ variant to probe protein stability (Eriksson et al., 1992b) it was also demonstrated that the large (150 Å3) internal cavity can bind hydrophobic ligands such as benzene (Eriksson et al., 1992a; Liu et al., 2009). It was early established that the cavity is inaccessible to solvent in the ground state, but that ligand binding is rapid (Feher et al., 1996), suggesting protein dynamics to play a potential role in the binding process. This posts a long-standing question of how the ligands gain access to the buried cavity (Mulder et al., 2000; López et al., 2013; Merski et al., 2015; Miao et al., 2015).

NMR relaxation dispersion measurements of L99A T4L demonstrated that this variant, but not the wild type protein, displayed conformational exchange on the millisecond timescale between the ground state and a minor state populated at around 3% (at room temperature) (Mulder et al., 2001). Such small populations generally lead only to minimal perturbations of ensemble-averaged experimental quantities making structural studies difficult, and hence it was difficult to probe whether the exchange process indeed allowed for ligand access to the cavity. A series of additional relaxation dispersion experiments, however, made it possible to obtain backbone and side chain CSs of the minor E state of L99A (Mulder et al., 2002; Bouvignies et al., 2011). The backbone CS data were subsequently used as input to a CS-based structure refinement protocol (CS-ROSETTA) to produce a structural model of the E state (EROSETTA; Figure 1) of the L99A mutant (Bouvignies et al., 2011). This model was based in part on the crystal structure of the ground state of L99A (referred to in what follows as GXray), but perturbing the structure in regions that the experiments demonstrated to undergo conformational change in a way so that the final model (EROSETTA) agrees with experiments. The structure was further validated by creating and solving the structure of a triple mutant variant that inverts the populations of the G and E states. The EROSETTA structure revealed substantial local rearrangements in T4L L99A, in particular near the cavity which gets filled by the side chain of a phenylalanine at position 114 (F114). Because the cavity is filled and solvent inaccessible in the E-state, the structure did, however, not reveal how ligands might access the cavity.

Structures of the major G and minor E states of L99A T4L and the hidden state hypothesis.

The X-ray structure of the G state (GXray; PDB ID code 3DMV) has a large internal cavity within the core of the C-terminal domain that is able to bind hydrophobic ligands. The structure of the E state (EROSETTA; PDB ID code 2LC9) was previously determined by CS-ROSETTA using chemical shifts. The G and E states are overall similar, apart from the region surrounding the internal cavity. Comparison of the two structures revealed two remarkable conformational changes from G to E: helix F (denoted as HF) rotates and fuses with helix G (HG) into a longer helix, and the side chain of phenylalanine at position 114 (F114) rotates so as to occupy part of the cavity. As the cavity is inaccessible in both the GXray and EROSETTA structures it has been hypothesized that ligand entry occurs via a third ‘cavity open’ state (Merski et al., 2015).

https://doi.org/10.7554/eLife.17505.003

In an attempt to benchmark the ability of simulations to map conformational free energy landscapes, we have here employed a series of in silico experiments designed to probe the structure and dynamics of L99A T4L and have compared the results to NMR measurements. We used enhanced-sampling MD simulations in explicit solvent and with state-of-the-art force fields to map the free-energy landscape including the exchange between the major and minor conformations of the protein. We used a series of recently developed metadynamics methods (Laio and Parrinello, 2002) to sample the conformational exchange process and associated structure and thermodynamics, as well as to determine the kinetics and mechanisms of exchange. We obtained additional insight into the structural dynamics of the E state using simulations that employed the experimental CSs as replica-averaged restraints. Our results provide a coherent picture of the conformational dynamics in L99A and extend the insights obtained from recent simulations of a triple mutant of T4L (Vallurupalli et al., 2016), by providing new information about the mechanisms of exchange and the transient exposure of the internal cavity. Together with previous results for Cyclophilin A (Papaleo et al., 2014) the results described here reiterate how simulation methods have now reached a stage where they can be used to study slow, conformational exchange processes such as those probed by NMR relaxation dispersion even in cases where less information is available from experiments.

Results and discussion

Mapping the free-energy landscape

As the average lifetime of the G and E states are on the order of 20–50 ms and 1 ms, respectively (Mulder et al., 2001, 2002; Bouvignies et al., 2011), direct and reversible sampling of the G-E transition at equilibrium would be extremely demanding computationally. Indeed, a recent set of simulations of a triple mutant of T4L, which has a substantially faster kinetics, was able only to observe spontaneous transitions in one direction (Vallurupalli et al., 2016). We therefore resorted to a set of flexible and efficient enhanced sampling methods, collectively known as ‘metadynamics’ (Laio and Parrinello, 2002), that have previously been used in a wide range of applications. In metadynamics simulations, a time-dependent bias is continuously added to the energy surface along a small number of user-defined collective variables (CVs). In this way, sampling is enhanced to reach new regions of conformational space and at the same time allows one to reconstruct the (Boltzmann) free-energy surface. The success of the approach hinges on the ability to find a set of CVs that together describe the slowly varying degrees of freedom and map the important regions of the conformational landscape.

We first performed a set of metadynamics simulations in the well-tempered ensemble (Barducci et al., 2008) using so-called path CVs (Spath and Zpath) (Branduardi et al., 2007; Saladino et al., 2012) with the aid of recently developed adaptive hills to aid in the convergence of the sampling (Branduardi et al., 2012; Dama et al., 2014) (see details in Appendix and Appendix 1—table 1). In short, the Spath variable describes the progress of the conformational transition between the GXray and EROSETTA structures with additional ‘interpolation’ using an optimal ‘reference’ path in a simplified model (see details in Appendix and Figure 2—figure supplement 1), while Zpath measures the distance to this reference path. In this way, the two-dimensional free energy landscape along Spath and Zpath provides a useful description on conformational exchange between ground and excited states that does not assume that the initial reference path describes perfectly the actual path(s) taken.

Projecting the sampled free energy landscape along Spath (upper panel of Figure 2) reveals a deep, narrow free energy basin around Spath=0.2 (labeled by red sphere and corresponding to the G state), and a broader, shallow free energy basin with Spath ranging from 0.6 to 0.8 (labeled by blue sphere and corresponding to the E state). Additional information is obtained from the two-dimensional landscape (shown as a negative free energy landscape, -F(Spath, Zpath), in the lower panel of Figure 2) which reveals a complex and rough landscape with multiple free energy minima (corresponding to mountains in the negative free energy landscape). Subsequently, structural inspection of these minima identified that the conformations in the basins around Spath=0.2 and Spath=0.75 correspond to the structures of GXray and EROSETTA, respectively.

Figure 2 with 2 supplements see all
Free energy landscape of the L99A variant of T4L.

In the upper panel, we show the projection of the free energy along Spath, representing the Boltzmann distribution of the force field employed along the reference path. Differently colored lines represent the free energy profiles obtained at different stages of the simulation, whose total length was 667ns. As the simulation progressed, we rapidly found two distinct free energy basins, and the free energy profile was essentially constant during the last 100 ns of the simulation. Free energy basins around Spath=0.2 and Spath=0.75 correspond to the previously determined structures of the G- and E-state, respectively (labelled by red and blue dots, respectively). As discussed further below, the E-state is relatively broad and is here indicated by the thick, dark line with Spath ranging from 0.55 to 0.83. In the lower panel, we show the three-dimensional negative free energy landscape, -F(Spath, Zpath), that reveals a more complex and rough landscape with multiple free energy minima, corresponding to mountains in the negative free energy landscape. An intermediate-state basin around Spath=0.36 and Zpath=0.05 nm2, which we denote I0.36, is labeled by a yellow dot.

https://doi.org/10.7554/eLife.17505.004

The broad nature of the free energy landscape in the region of the minor state is consistent with the observation that our MD simulations initiated from EROSETTA display significant conformational fluctuations (RUN20 and RUN22 in Appendix 1—table 1). Furthermore, our metadynamics simulations revealed multiple local free energy minima adjacent to the EROSETTA basin, together composing a wider basin (highlighted by the black curve in Figure 2). Thus, these simulations suggest that the E state displays substantial conformational dynamics, a result corroborated by simulations that have been biased by the experimental data (see section ‘Simulations of the minor state using chemical shift restraints’).

In addition to free-energy minima corresponding to the G and E states, we also found a free energy minimum around Spath=0.36 and Zpath=0.05 nm2 (denoted as I0.36 and labeled by a yellow sphere in Figure 2) that is located between the G and E states on the one-dimensional free-energy surface. We note, however, that it is difficult to infer dominant reaction pathways from such free energy surfaces, and so from this data alone, we cannot determine whether I0.36 occurs as an intermediate in G-E conformational transitions. Indeed, it appears from the two-dimensional surface that there exist multiple possible pathways between G and E, as illustrated by white lines along the mountain ridges of the negative free energy landscape in the lower panel of Figure 2. (We also explored the mechanism of exchange by reconnaissance metadynamics simulations (Tribello et al., 2011), the results of which are described and discussed further below.)

Effect of mutations on the free energy landscape

Based on the encouraging results above for L99A T4L, we examined whether simulations could also capture the effect of mutations on the free energy landscape. Using Rosetta energy calculations on the GXray and EROSETTA structures it was previously demonstrated that two additional mutations, G113A and R119P, when introduced into the L99A background, cause an inversion in the populations of the two states (Bouvignies et al., 2011; Vallurupalli et al., 2016). Indeed, NMR data demonstrated that the triple mutant roughly inverts the populations of the two states so that the minor state structure (of L99A) now dominates (with a 96% population) the triple mutant. We repeated the calculations described above for L99A also for the triple mutant. Remarkably, the free energy profile of the triple mutant obtained using metadynamics simulations reveals a free energy landscape with a dominant minimum around Spath=0.7 and a higher energy conformation around Spath=0.15 (Figure 2—figure supplement 2). Thus, like our previous observations for a ‘state-inverting mutation’ in Cyclophilin A (Papaleo et al., 2014), we find here that the force field and the sampling method are sufficiently accurate to capture the effect of point mutations on the free energy landscape. Further, we note that the barrier height for the conformational exchange in the triple mutant is very similar to the value recently estimated using a completely orthogonal approach (Vallurupalli et al., 2016). Finally, we attempted to determine the free energy landscape of the L99A,G113A double mutant, which has roughly equal populations of the two states (Bouvignies et al., 2011), but this simulation did not converge on the simulation timescales at which the two other variants converged.

Calculating conformational free energies

With a free-energy surface in hand and a method to distinguish G- and E-state conformations, we calculated the free energy difference, ΔG, between the two conformational states, and compared with the experimental values. We divided the global conformational space into two coarse-grained states by defining the separatrix at Spath=0.46 which corresponds to a saddle point on the free energy surface, on the basis of the observations above that the E state is relatively broad. Although a stricter definition of how to divide the reaction coordinate certainly helps the precise calculation, here we just used this simple definition to make an approximate estimation of the free energy difference. Further, since the barrier region is sparsely populated, the exact point of division has only a modest effect on the results. By summing the populations on the two sides of the barrier we calculated ΔG as a function of the simulation time (Figure 3). Initially during the simulations the free energy profile varies substantially (Figure 2) and the free energy difference equally fluctuates. As the simulations converge, however, the free energy difference between the two states stabilize to a value at approximately ΔG=3.5 kcal mol−1 (Figure 3, black line). This value can be compared to the value of 2.1 kcal mol−1 obtained from NMR relaxation dispersion experiments (Mulder et al., 2001), revealing reasonably good, albeit not exact, agreement with the experiments.

Estimation of free energy differences and comparison with experimental measurements.

We divided the global conformational space into two coarse-grained states by defining the separatrix at Spath=0.46 (0.48 for the triple mutant) in the free energy profile (Figure 2—figure supplement 2) which corresponds to a saddle point of the free energy surface, and then estimated the free energy differences between the two states (ΔG) from their populations. The time evolution of ΔG of L99A (upper time axis) and the triple mutant (lower axis) are shown as black and blue curves, respectively. The experimentally determined values (2.1 kcal mol−1 for L99A and −1.9 kcal mol−1 for the triple mutant) are shown as yellow lines.

https://doi.org/10.7554/eLife.17505.007

Similar calculations using the simulations of the triple mutant also converge, in this case to about −1.6 kcal mol−1 (Figure 3, blue line), in excellent agreement with the experimental measurement (−1.9 kcal mol−1) (Bouvignies et al., 2011). Combining these two free energy differences we find that the G113A, R119P mutations cause a shift in the G-E free energy of 5.1 kcal mol−1 in simulations compared to 4.0 kcal mol−1 obtained by experiments. Thus, we find that the simulations with reasonably high accuracy are able to capture the thermodynamics of the conformational exchange between the two states. While the generality of such observations will need to be established by additional studies, we note here that comparably good agreement was obtained when estimating the effect of the S99T mutations in Cyclophilin A (Papaleo et al., 2014).

In our previous work on Cyclophilin A (Papaleo et al., 2014) we sampled the conformational exchange using parallel-tempering metadynamics simulations (Bonomi and Parrinello, 2010) using four CVs that we chose to describe the structural differences between the G and E states in that protein. We note here that we also tried a similar approach here but unfortunately failed to observe a complete G-to-E transition, even in a relatively long trajectory of about 1 μs per replica (CVs summarized in Appendix 1—table 2, parameters shown in Appendix 1—table 1). This negative result is likely due to the CVs chosen did not fully capture the relevant, slowly changing degrees of freedom, thus giving rise to insufficient sampling even with the use of a parallel tempering scheme.

Calculating the rates of conformational exchange

Enhanced-sampling simulations such as those described above provide an effective means of mapping the free-energy landscape and hence the structural and thermodynamic aspects of conformational exchange. While the same free-energy landscape also determines the kinetics and mechanisms of exchange it may be more difficult to extract this information from e.g. path-CV-based metadynamics (PathMetaD) simulations. To examine how well simulations can also be used to determine the rates of the G-to-E transitions, quantities that can also be measured by NMR, we used the recently developed ‘infrequent metadynamics’ method (InMetaD, see details in Appendix) (Tiwary and Parrinello, 2013; Salvalaglio et al., 2014; Tiwary et al., 2015a; 2015b). Briefly described, the approach calculates first-passage times for the conformational change in the presence of a slowly-added bias along a few CVs, here chosen as the path CVs also used to map the landscape. By adding the bias slowly (and with lower amplitude) we aim to avoid biasing the transition-state region and hence to increase the rate only by lowering the barrier height; in this way it is possible to correct the first-passage times for the bias introduced.

Using this approach on L99A T4L we collected 42 and 36 independent trajectories with state-to-state transition starting from either the G state or E state, respectively (Appendix 1—figure 1 and Appendix 1—figure 2 ). The (unbiased) rates that we calculated (Table 1 and Appendix 1—figure 3) are in good agreement with the experimental rates (Mulder et al., 2001; Bouvignies et al., 2011) (within a factor of 10), corresponding to an average error of the barrier height of 1 kcal mol−1. We also performed similar calculations for the ‘population-inverting’ triple mutant, where we collected 30 transitions (15 for each direction) using InMetaD simulations. As for L99A, we also here find similarly good agreement with experimental measurements (Vallurupalli et al., 2016) (Table 1 and Appendix 1—figure 4). We estimated the reliability of this computational approach using a Kolmogorov-Smirnov test to examine whether the first-passage times conform to the expected Poisson distribution (Salvalaglio et al., 2014), and indeed the results of this analysis suggest good agreement (Table 1, Appendix 1—figure 5 and Appendix 1—figure 6).

Table 1

Free energy differences and rates of conformational exchange.

https://doi.org/10.7554/eLife.17505.008
τGE (ms)τEG (ms)ΔG (kcal mol−1)
L99A
NMR200.72.1
InMetaD175 ± 561.4 ± 0.62.9 ± 0.5
PathMetaD3.5
L99A/G113A/R119P
NMR0.24-1.9
InMetaD2.0 ± 1.714.3 ± 8.3-1.2 ± 1.1
PathMetaD-1.6

The ability to calculate forward and backward rates between G and E provided us with an alternative and independent means to estimate the free energy difference between the two states (Table 1), and to test the two-state assumption used in the analysis of the experimental NMR data. We therefore calculated the free energy difference from the ratio of the forward and backward reaction rates. The values obtained (2.9 ± 0.5 kcal mol−1 and −1.2 ± 1.1 kcal mol−1 for L99A and the triple mutant, respectively) are close both to the values obtained above from the equilibrium free energy landscape (3.5 kcal mol−1 and −1.6 kcal mol−1) and experiment (2.1 kcal mol−1 and −1.9 kcal mol−1). In particular, the relatively close agreement between the two independent computational estimates lends credibility both to the free energy landscape and the approach used to estimate the kinetics. The observation that both values for L99A are slightly larger than the experimental number suggests that this discrepancy (ca. 1 kcal mol−1) can likely be explained by remaining force field deficiencies rather than lack of convergence or the computational approach used.

Simulations of the minor state using chemical shift restraints

While the simulations described above used available structural information of G and E states to guide and enhance conformational sampling, the resulting free energy surfaces represent the Boltzmann distributions of the force field and are not otherwise biased by experimental data. To further refine the structural model of the E state we used the relaxation-dispersion derived CSs that were used to determine of EROSETTA [BMRB (Ulrich et al., 2008) entry 17604] as input to restrained MD simulations. In these simulations, we used the experimental data as a system-specific force field correction to derive an ensemble of conformations that is compatible both with the force field and the CSs. Such replica-averaged simulations use the experimental data in a minimally-biased way that is consistent with the Principle of Maximum Entropy (Pitera and Chodera, 2012; Roux and Weare, 2013; Cavalli et al., 2013; Boomsma et al., 2014).

We performed CS-restrained MD simulations of the E state of L99A averaging the CSs over four replicas. Although the number of replicas is a free parameter, which should in principle be chosen as large as possible, it has been demonstrated that four replicas are sufficient to reproduce the structural heterogeneity accurately (Camilloni et al., 2013) without excessive computational requirements. The agreement between calculated and experimental CSs was quantified by the root-mean-square deviation between the two (Figure 4—figure supplement 1). In particular, it is important not to bias the agreement beyond what can be expected based on the inherent accuracy of the CS prediction methods (we assumed that the error in the experimental CS measurement, even for the E state, is negligible in comparison). Thus, we compared the experimental CS values of the minor state with the values calculated using the EROSETTA structure as input to CamShift (Kohlhoff et al., 2009), Sparta+ (Shen and Bax, 2010) and ShiftX (Neal et al., 2003) (Figure 4—figure supplement 2). The average RMSDs for five measured nuclei (Hα, HN, N, C and Cα) are 0.2, 0.4, 2.0, 0.8 and 1.1ppm, respectively (Appendix 1—table 1), which are close to the inherent uncertainty of the CS back-calculation, indicating that the level of agreement enforced is reasonable.

To compare the results of these experimentally-biased simulations with the experimentally-unbiased simulations described above, we projected the CS-restrained MD trajectories onto either one (Figure 4) or both (Figure 4—figure supplement 3) of the Spath and Zpath variables used in the path-variable-driven simulations (PathMetaD). The distribution of conformations obtained using the E-state CSs as restraints is in good agreement with the broad free energy profile of the E-state obtained in the metadynamics simulations that did not include any experimental restraints. To ensure that this observation is not merely an artifact of both simulations using the same force field (CHARMM22*), we repeated the biased simulations using the Amber ff99SB*-ILDN force field and obtained comparable results. We also verified that the conclusions obtained are reasonably robust to other variables such as the number of replicas and the strength of restraints (Figure 4—figure supplement 4).

Figure 4 with 5 supplements see all
Conformational ensemble of the minor state as determined by CS biased, replica-averaged simulations.

We determined an ensemble of conformations corresponding to the E-state of L99A T4L using replica-averaged CSs as a bias term in our simulations. The distribution of conformations was projected onto the Spath variable (orange) and is compared to the free energy profile obtained above from the metadynamics simulations without experimental biases (black line). To ensure that the similar distribution of conformations is not an artifact of using the same force field (CHARMM22*) in both simulations, we repeated the CS-biased simulations using also the Amber ff99SB*-ILDN force field (magenta) and obtained similar results. Finally, we used the ground state CSs of a triple mutant of T4L, which was designed to sample the minor conformation (of L99A) as its major conformation, and also obtained a similar distribution along the Spath variable (cyan).

https://doi.org/10.7554/eLife.17505.009

As a final and independent test of the structural ensemble of the minor conformation of L99A we used the ground state CSs of the triple mutant (BMRB entry 17603), which corresponds structurally to the E state of L99A, as restraints in replica-averaged CS-biased simulations (Figure 4—figure supplement 5). Although not fully converged, these simulations also cover roughly the same region of conformational space when projected along Spath (Figure 4).

Thus, together our different simulations, which employ different force fields, are either unbiased or biased by experimental data, and use either dispersion-derived (L99A) or directly obtained (triple mutant) CS all provide a consistent view of the minor E-state conformation of L99A. We also note that the CS-derived ensembles of the E-state support the way we divided the G- and E-states when calculating conformational free energy differences between the two states.

Mechanisms of conformational exchange

Having validated that our simulations can provide a relatively accurate description of the structure, thermodynamics and kinetics of conformational exchange we proceeded to explore the molecular mechanism of the G-to-E transitions. We used the recently developed reconnaissance metadynamics approach (Tribello et al., 2010), that was specifically designed to enhance sampling of complicated conformational transitions and has been employed to explore the conformational dynamics of complex systems (Tribello et al., 2011; Söderhjelm et al., 2012).

We performed three independent reconnaissance metadynamics simulations of L99A starting from the G state (summarized in Appendix 1—table 1) using the same geometry-based CVs that we also used in the parallel-tempering simulations described above. We observed complete conformational transitions from the G to E state in the reconnaissance simulations in as little as tens of nanoseconds of simulations (Figure 5—figure supplement 1)— at least 1–2 orders of magnitude faster than standard metadynamics. These G-to-E and E-to-G transitions, although biased by the CVs, provide insight into the potential mechanisms of exchange. To ease comparison with the equilibrium sampling of the free energy landscape we projected these transitions onto the free energy surface F(Spath,Zpath) (Figure 5). The results reveal multiple possible routes connecting the G and E states, consistent with the multiple gullies found on the free energy surface (Figure 2). The trajectories also suggested that the G-to-E interconversion can either take place directly without passing the I0.36 state or indirectly via it.

Figure 5 with 3 supplements see all
Mechanisms of the G-E conformational exchanges explored by reconnaissance metadynamics.

Trajectories labeled as Trj1 (magenta), Trj2 (blue) and Trj3 (green and orange) are from the simulations RUN10, RUN11 and RUN12 (Appendix 1—table 1), respectively. There are multiple routes connecting the G and E states, whose interconversions can take place directly without passing the I0.36 state or indirectly via it.

https://doi.org/10.7554/eLife.17505.015

In the context of coarse-grained kinetic models the results above would suggest at least two possible mechanisms operate in parallel: GE or GI0.36E. Further inspection of the structures along these different kinetics routes (see the trajectories of other order parameters in Figure 5—figure supplement 2 and Videos 14) suggested an interesting distinction between the two. In the GI0.36E route the side chain of F114, which occupies the cavity in the E state, gets transiently exposed to solvent during the transition, whereas in the direct GE transitions F114 can rotate its side chain inside the protein core (see also the solvent accessible surface area calculation of F114 in Figure 5—figure supplement 3).

Video 1
Trajectory of the G-to-E conformational transition observed in Trj1, corresponding to the red trajectory in Figure 5.

The backbone of L99A is represented by white ribbons, Helices E, F and G are highlighted in blue, while F114 is represented by red spheres.

https://doi.org/10.7554/eLife.17505.019
Video 2
Trajectory of the G-to-E conformational transition observed in Trj2, corresponding to the blue trajectory in Figure 5.

The backbone of L99A is represented by white ribbons, Helices E, F and G are highlighted in blue, while F114 is represented by red spheres.

https://doi.org/10.7554/eLife.17505.020
Video 3
Trajectory of the G-to-E conformational transition observed in Trj3, corresponding to the green trajectory in Figure 5.

The backbone of L99A is represented by white ribbons, Helices E, F and G are highlighted in blue, while F114 is represented by red spheres.

https://doi.org/10.7554/eLife.17505.021
Video 4
Trajectory of the E-to-G conformational transition observed in Trj3, corresponding to the yellow trajectory in Figure 5.

The backbone of L99A is represented by white ribbons, Helices E, F and G are highlighted in blue, while F114 is represented by red spheres.

https://doi.org/10.7554/eLife.17505.022

A potential pathway for ligand binding and escape

As the internal cavity in L99A T4L remains buried in both the G and E states (and indeed occupied by F114 in the E state) it remains unclear how ligands access this internal cavity and how rapid binding and release is achieved. Visual inspection of our trajectories and solvent-accessible surface area analysis revealed structures with transient exposure of the internal cavity towards the solvent. The structures were mostly found in a region of conformational space that mapped onto the I0.36 basin (Figure 2), and the events of that basin mostly took place between 430 ns and 447 ns (see Video 5). Thus, we mapped these structures to the free energy surface (Figure 6—figure supplement 1) and analysed them. Overall, the structure is more similar to the G- than E-state, though is more loosely packed. The similarity to the G-state is compatible with rapid binding and position of F114 in this state.

Video 5
Movie of the calculated two-dimensional free energy landscape of L99A as a function of simulation time.

The figure shows the time evolution of the free energy surface as a function of Spath and Zpath sampled in a 667 ns PathMetaD simulation of L99A.

https://doi.org/10.7554/eLife.17505.023

We used CAVER3 (Chovancova et al., 2012) (see parameters in Appendix 1—table 4) to analyse the structures and found multiple tunnels connecting the cavity with protein surface (Figure 6—figure supplement 1 and 2). The tunnels are relatively narrow with the typical radius of the bottleneck (defined as the narrowest part of a given tunnel) between ∼1 Å − ∼2 Å. We used CAVER Analyst1.0 (Kozlikova et al., 2014) (see details in Appendix and parameters in Appendix 1—table 4) to separate the tunnels into different clusters (Figure 6—figure supplement 3 and Appendix 1—table 5) with the dominant cluster (denoted tunnel#1) having a entrance located at the groove between HF and HI. A typical representative structure of I0.36 is shown in Figure 6A. The radii along the structures in cluster #1 vary, but share an overall shape (Figure 6—figure supplement 1), and we find that the maximal bottleneck radius is 2.5 Å, the average bottleneck radius is 1.3 Å, and the average length 11.2 Å.

Figure 6 with 4 supplements see all
A transiently formed tunnel from the solvent to the cavity is a potential ligand binding pathway.

(A) We here highlight the most populated tunnel structure (tunnel#1), that has an entrance located at the groove between helix F (HF) and helix I (HI). Helices E, F and G (blue) and F114 (red) are highlighted. (B) The panel shows a typical path of benzene (magenta) escaping from the cavity of L99A, as seen in ABMD simulations, via a tunnel formed in the same region as tunnel #1 (see also Video 6).

https://doi.org/10.7554/eLife.17505.024

Interestingly, a series of structures of L99A were recently described, in which the internal cavity where filled with eight congeneric ligands of increasing size to eventually open the structure size (Merski et al., 2015). We performed a comparable tunnel analysis on those eight ligand-bound structures (PDB ID codes: 4W52 – 4W59), revealing the maximal bottleneck radius of 1.8 Å (bound with n-hexylbenzene, 4W59). Although the size of the tunnel in these X-ray structures is slightly smaller than that in I0.36 structures, the location of the tunnel exit is consistent with the dominant tunnel#1 in I0.36 (Figure 6—figure supplement 3). We note, however, that the tunnels observed in our simulation and in the ligand-induced cavity-open X-ray structure (4W59), are too narrow to allow for unhindered passage of e.g. benzene with its a van der Waals’ width of 3.5 Å (Eriksson et al., 1992a). Thus, we speculate that the transient exposure in I0.36 might serve as a possible starting point for ligand (un)binding, which would induce (Koshland, 1958; López et al., 2013; Wang et al., 2012) further the opening of the tunnel.

As an initial step towards characterizing the mechanism of ligand binding and escape we used adiabatic biased molecular dynamics (ABMD) simulations (Marchi and Ballone, 1999; Paci and Karplus, 1999) to study the mechanism of how benzene escapes the internal cavity (see Appendix for details). In ABMD the system is perturbed by a ‘ratcheting potential’, which acts to ‘select’ spontaneous fluctuations towards the ligand-free state. In particular, the biasing potential is zero when the reaction coordinate (here chosen to be the RMSD of the ligand to the cavity-bound state) increases, but provides a penalty for fluctuations that brings the ligand closer to the cavity. In this way, we were able to observe multiple unbinding events in simulations despite the long lifetime (1.2 ms) of the ligand in the cavity. Most of trajectories (15 of the 20 events observed) reveal that benzene escapes from the cavity by following tunnel #1 (Figure 6—figure supplement 4 and Appendix 1—table 6). A typical unbinding path is shown in the right panel of Figure 6 (see also Video 6). Because the ABMD introduces a bias to speed up ligand escape, we ensured that the observed pathway was the same at two different values of the biasing force constants (Figure 6—figure supplement 4 and Appendix 1—table 6). Future work will be aimed to perform a more quantitative analysis of the ligand binding and unbinding kinetics.

Video 6
A typical trajectory of the benzene escaping from the buried cavity of L99A via tunnel #1 revealed by ABMD simulations.

The backbone of L99A is represented by white ribbons, Helices E, F and G are highlighted in blue, while F114 and benzene are represented by spheres in red and magenta, respectively.

https://doi.org/10.7554/eLife.17505.029

Conclusions

The ability to change shape is an essential part of the function of many proteins, but it remains difficult to characterize alternative conformations that are only transiently and sparsely populated. We have studied the L99A variant of T4L as a model system that displays a complicated set of dynamical processes which have been characterized in substantial detail. Our results show that modern simulation methods are able to provide insight into such processes, paving the way for future studies for systems that are more difficult to study experimentally.

Using a novel method for defining an initial reference path between two conformations, we were able to sample the free energy landscape described by an accurate molecular force field. In accordance with experiments, the simulations revealed two distinct free energy basins that correspond to the major and minor states found by NMR. Quantification of the free energy difference between the two states demonstrated that the force field is able to describe conformational free energies to an accuracy of about 1 kcal mol−1. This high accuracy is corroborated by previous studies of a different protein, Cyclophilin A, where we also calculated conformational free energies and compared to relaxation dispersion experiments and found very good agreement. For both proteins we were also able to capture and quantify the effect that point mutations have on the equilibrium between the two states, and also here found good agreement with experiments. We note, however, that comparable simulations of the L99A/G113A mutant did not reach convergence.

Moving a step further, we here also calculated the kinetics of conformational exchange using a recently developed metadynamics method. For both the L99A variant and a population-inverting triple mutant we find that the calculated reaction rates are in remarkably good agreement with experiments. The ability to calculate both forward and backward rates provided us with the opportunity to obtain an independent estimate the calculated free energy difference. The finding that the free energy differences estimated in this way (for both L99A and the triple mutant) are close to those estimated from the free energy landscape provides an important validation of both approaches, and we suggest that, when possible, such calculations could be used to supplement conventional free energy estimates.

The free-energy landscape suggested that the E state is relatively broad and contains a wider range of conformations. To validate this observation, we used the same chemical shift information as was used as input to Rosetta and performed replica-averaged CS-restrained simulations. The resulting ensemble demonstrates that the experiments and force field, when used jointly, indeed are compatible with a broader E state. Thus, we suggest that the EROSETTA structure and CS-restrained ensemble jointly describe the structure and dynamics of the E state.

While NMR experiments, in favourable cases, can be used to determine the structure, thermodynamics and kinetics of conformational exchange, a detailed description mechanism of interconversion remains very difficult to probe by experiments. We explored potential mechanisms of conformational exchange between the two states, finding at least two distinct routes. One route involved a direct transition with the central F114 entering the cavity within the protein, whereas a different possible mechanism involves transient partial-loosening of the protein. In both cases, the mechanism differs from the reference path that we used as a guide to map the free energy landscape, demonstrating that high accuracy of the initial guess for a pathway is not absolutely required in the metadynamics simulations, suggesting also the more general applicability of the approach.

Finally, we observed a set of conformations with a transiently opened tunnel that leads from the exterior of the protein to the internal cavity, that is similar to a recently discovered path that is exposed when the cavity is filled by ligands of increasing size. The fact that such a tunnel can be explored even in the absence of ligands suggests that intrinsic protein motions may play an important role in ligand binding, and indeed we observed this path to be dominant in simulations of ligand unbinding.

In total, we present a global view of the many, sometimes coupled, dynamical processes present in a protein. Comparison with a range of experimental observations suggests that the simulations provide a relatively accurate description of the protein, demonstrating how NMR experiments can be used to benchmark quantitatively the ability of simulations to study conformational exchange. We envisage that future studies of this kind, also when less is known about the structure of the alternative states, will help pave the way for using simulations to study the structural dynamics of proteins and how this relates to function.

Materials and methods

System preparation

Request a detailed protocol

Our simulations were initiated in the experimentally determined structures of the ground state of L99A (GXray; PDB ID code 3DMV) or minor, E state (EROSETTA; 2LCB). The structure of the ground state of the L99A, G113A, R119P triple mutant, corresponding to the E state of L99A was taken from PDB entry 2LC9 (GROSETTATriple). Details can be found in the Appendix.

Initial reaction path

Request a detailed protocol

Taking GXray and EROSETTA as the models of the initial and final structures, we calculated an initial reaction path between them with the MOIL software (Elber et al., 1995), which has been used to explore the mechanism of conformational change of proteins (Wang et al., 2011). Further details can be found in the Appendix and in refs. (Majek et al., 2008; Wang et al., 2011).

Path CV driven metadynamics simulations with adaptive hills

Request a detailed protocol

The adaptive-hill version of metadynamics updates the Gaussian width on the fly according to the local properties of the underlying free-energy surface on the basis of local diffusivity of the CVs or the local geometrical properties. Here, we used the former strategy. Simulation were performed using Gromacs4.6 (Pronk et al., 2013) with the PLUMED2.1 plugin (Tribello et al., 2014). See parameter details in Appendix 1—table 1.

Replica-averaged CS-restrained simulations

Request a detailed protocol

We performed replica-averaged CS restrained MD simulations by using GPU version of Gromacs5 with the PLUMED2.1 and ALMOST2.1 (Fu et al., 2014) plugins. Equilibrated structures of EROSETTA and GROSETTATriple were used as the starting conformations. CS data of EROSETTA and GROSETTATriple were obtained from the BMRB database (Ulrich et al., 2008) as entries 17604 and 17603, respectively.

Reconnaissance metadynamics simulations

Request a detailed protocol

Reconnaissance metadynamics (Tribello et al., 2010) uses a combination of a machine learning technique to automatically identify the locations of free energy minima by periodically clustering the trajectory and a dimensional reduction technique that can reduce the landscape complexity. We performed several reconnaissance metadynamics simulations with different combinations of CVs starting from GXray using Gromacs4.5.5 with PLUMED1.3 plugin. See parameter details in Appendix 1—table 1.

Calculating kinetics using infrequent metadynamics

Request a detailed protocol

The key idea of infrequent metadynamics is to bias the system with a frequency slower than the barrier crossing time but faster than the slow intra-basin relaxation time, so that the transition state region has a low risk of being substantially biased. As the first transition times should obey Poisson statistics, the reliability of the kinetics estimated from InMetaD can be assessed by a statistical analysis based on the Kolmogorov-Smirnov (KS) test (Salvalaglio et al., 2014). See parameter details on Appendix and Appendix 1—table 1.

Appendix

Molecular modeling and system preparation

We used the crystal structure of T4L L99A (PDB ID code 3DMV) as starting point for simulations of the G state of L99A. For the E state of L99A and the G state of the L99A, G113A, R119P triple-mutant we used the CS-ROSETTA structures with PDB ID code 2LCB and 2LC9, respectively.

Each protein was solvated in a dodecahedral box of TIP3P water molecules with periodic boundary conditions. The protein-solvent box had a distance of 10 Å from the solute to the box boundary in each dimension, which results in approximately 10,000 water molecules and more than 32,000 atoms. Chloride counter-ions were included to neutralize the overall electric charge of the system. We used the CHARMM22* force field (Piana et al., 2011) for most of our simulations, but also used the Amber ff99SB*-ILDN (Hornak et al., 2006; Best and Hummer, 2009; Lindorff-Larsen et al., 2010) for some simulations to examine the dependency of the results on the choice of force field.

The van der Waals interactions were smoothly shifted to zero between 0.8 and 1.0 nm, and the long-range electrostatic interactions were calculated by means of the particle mesh Ewald (PME) algorithm with a 0.12 nm mesh spacing combined with a switch function for the direct space between 0.8 and 1.0 nm. The bonds involving hydrogen atoms were constrained using the LINCS algorithm. We employed the V-rescale thermostat (Bussi et al., 2007) to control the temperatureand simulated the system in the canonical ensemble.

Reference transition path and path collective variables

We used path collective variables both to enhance sampling in path driven metadynamics (see below) as well as to represent the conformational landscape sampled by other means. Path collective variables have previously been shown to be very useful in finding free energy channels connecting two metastable states, and also able to construct the global free energy surfaces even far away from the initial path (Branduardi et al., 2007). A reference path is defined by a set of conformations along the path, and the progress along this path can be described mathematically as:

Spath(X)=i=1NieλMi(X)i=1NeλMi(X)

Here X are the coordinates of the instantaneous protein conformer sampled by MD simulations, N is the number of frames used to describe the reference path (often dependent on the length scale of the conformational transition process), Mi(X) is the mean-square deviation (after optimal alignment) of a subset of the atoms from the reference structure of i th frame, and λ is a smoothing parameter whose value should roughly be proportional to the inverse of the average mean square displacement between two successive frames along the reference path. With this definition, Spath quantifies how far the instantaneous conformer, X, is from the reactant state and the product state, thus monitoring the progress of the system along the conformational transition channel.

Using Spath as the sole CV would assume that the initial reference path contains a sufficient description of the important degrees of freedom between the two states. It is, however, rarely possible to guess such a path because determining the actual pathway taken is a goal of the simulation. Thus, Spath is supplemented by a second CV, Zpath, which measures the deviation away from the structures on the reference path. I.e. if Spath quantifies the progress along the path, Zpath measures the distance away from the reference path:

Zpath(X)=-1λlni=1Ne-λMi(X)

The combination of Spath and Zpath thus maps the entire conformational landscape to a two-dimensional projection, which can also be thought of as a tube connecting the two end states, and where S measures the progress along the tube and Z the width of the tube. The usefulness of the path CVs is, however, dependent on the quality of reference path, which is determined amongst other things by two factors: (1) the relative accuracy of the reference path and (2) how uniform reference structures are distributed along the path. Because of the explosion in number of possible conformations as one progresses along Zpath, simulations are mostly enhanced when Spath provides a relatively good description of the pathway taken. Further, if reference structures are only placed sparsely along the path, one looses resolution of the free energy surface and also decreases the ability to enhance sampling.

To obtain a good reference path, without knowing beforehand the mechanism of conversion, we here used a method to construct snapshots along a possible initial path. Taking GXray and EROSETTA as initial and final structures, we calculated the optimal reaction paths between them with the MOIL software (Elber et al., 1995) which has previously been used to explore the mechanism of conformational change of proteins (Wang et al., 2011) . After minimizing endpoint structures, we employed the minimum-energy-path self-penalty walk (SPW) (Czerminski and Elber, 1990) functional embedded in the CHMIN module to obtain an initial guess for the conformational transition path. This path was subsequently optimized in the SDP (steepest descent path) module by minimizing the target function T consisting of two terms S and C. S is an action function that provides approximate most probable Brownian trajectories connecting the reactant and product states, while C is a restraint function aimed to distribute frames approximately uniformly along the path. They can be expressed by:

S=i=1N1Hs+(Uxi)2|xi+1xi|C=λi(Δli,i+1< Δl >)2

where N is the number of frames along the reference path, xi is the entire vector of conformational coordinates of frame i, U is a potential energy as a function of the mass-weighted coordinate vector, Hs is a constant with an arbitrary positive value, which can be tuned to generate the optimal paths with different thermal energies. Δli,i+1=M1/2|xi-xi+1| is the arc-length between consecutive frames. λ controls the strength of the restraint function C.

The SDP or minimum energy path is the limiting path in which the action S is optimized with Hs0. An important advantage of an SDP is that it is capable of giving a good guess of the minimal energy path which can reflect the major mechanism, only with inexpensive computation. Further details can be found in Refs. (Wang et al., 2011; Majek et al., 2008).

The SDP was approximated as 31 discrete conformations. Most regions of the GXray and EROSETTA structures are very similar, with the exception of the cavity-related atoms whose movement determines the conformational transitions between the G and E states. To minimize computations, we used only a subset of heavy atoms around the cavity from amino acid residues 99 to 126 to define Spath and Zpath, resulting a ‘light version’ of the reference path which included only 212 atoms. We used the Cα atoms of the C-terminal domain to align the molecule. It is important to note that by focusing only on atoms surrounding the cavity in the calculation of the path-variables we only enhance sampling of the conformational changes relating to the cavity. We used λ=56.0 based on the consideration that smooth change of the function of Spath can be achieved when eλ<Mi(X)>0.1 (Tribello et al., 2014).

We defined Spath and Zpath using the SDP as described above. The equidistant requirement of the path is satisfied by the penalty function as is evident from the RMSD matrix for path frames which has a gullwing shape (Figure 2—figure supplement 1, indicating that each frame is closest to its neighbor and more different from all other reference frames.

Metadynamics simulations

Metadynamics discourages the system from sampling already visited conformational regions by continuously adding an external history-dependent repulsive potential at the present value of the reaction coordinates or CVs, which are assumed to include the slowly varying degrees of freedom and thus describe the main features of the dynamics (Laio and Parrinello, 2002). The biasing potential in metadynamics results in an artificial (enhanced) dynamics but makes it possible to reconstruct the free energy surfaces by removing the bias introduced. The bias is typically added as Gaussians at regular time intervals, τG, and is given by:

VG(S,t)=k=1|t/τG|ωkδ(S(t),S(kτG))

Here S denotes the CVs, k is the index of the individual Gaussians, ωk is the height of the k’th Gaussian, and δ(S(t),S(kτG)) is a short-ranged kernel function of the CVs:

δ(S(t),S(kτG))=ej=1n-|sj(t)-sj(kτG)|22σkj2

where n is the number of CV, j is the index of a CV, and σkj and sj(kτG) are the width and the position of the Gaussian hills, respectively.

We here used a range of different metadynamics approaches to determine the free energy landscape, and the mechanism and kinetics of conformational exchange (see below).

Well-tempered metadynamics

In the well-tempered version of metadynamics, the height of the individual Gaussians, ωk, is decreased as the total bias accumulates over time, in order to improve the convergence of the free energy:

ωk=ω0e-1γ-1VG(S,kτG)kBT

Here ω0 is the initial height, γ=(T+ΔT)/T is referred as the bias factor, which can be tuned to control the speed of convergence and diminish the time spent in lesser-relevant, high-energy states. Thus, the quantity T+ΔT is often referred as the fictitious CV temperature.

Adaptive-width metadynamics

In contrast to standard metadynamics in which the width of the Gaussians is constant, the adaptive-width version of metadynamics updates the Gaussian width σkj on the fly according to local properties of the underlying free-energy surface on the basis of local diffusivity of the CVs or the local geometrical properties (Branduardi et al., 2012).

In the region of conformational space near the endpoints of the path CVs many conformations are compressed on similar CV values, leading to high-density but low-fluctuation boundaries. It is apparent that the use of a fixed width might give an inaccurate estimation of the free energy profile in the boundaries where the free energy basins of reactant and product states are located, also makes it more difficult for the simulations to converge. Therefore, the feature of shape adaptive of Gaussian potential is particularly helpful for the case of using path CVs that have significant boundary effects.

Metadynamics with path variables (PathMetaD)

We sampled the free energy landscape along Spath and Zpath, as defined above, using adaptive-width metadynamics, which resulted in a finer resolution and faster convergence of the free energy landscape, in particular near the path boundaries, than standard metadynamics. The production simulations were performed at 298K in well-tempered ensemble (Appendix 1—table 1: RUN1 and RUN2).

Reconnaissance metadynamics

To explore the mechanism of conformational exchange we used reconnaissance metadynamics (Tribello et al., 2010). This is a ‘self-learning’ approach which combines of a machine learning technique to automatically identify the locations of free energy minima by periodically clustering the trajectory and a dimensional reduction technique that can reduce the complex locations to a locally-defined one-dimensional CV by using information collected during the clustering. It has previously been shown that reconnaissance metadynamics makes it possible to determine a path from a large set of input collective variables (Söderhjelm et al., 2012; Tribello et al., 2011).

Infrequent Metadynamics (InMetaD)

We used the recently described ‘infrequent metadynamics’ (InMetaD) to obtain the rates of the conformational exchange process (Tiwary and Parrinello, 2013). In standard metadynamics simulations it is very difficult to obtain kinetic properties because the biasing potential is added both to the free energy basins as well as the barriers that separate them. While it is potentially possible to determine the rates from the height of the free energy barriers, this requires both that the CVs used represent the entire set of slowly varying degrees of freedom, and also a good estimate of the pre-exponential factor to convert barrier height to a rate.

The key idea in InMetaD to circumvent these problems is to attempt to add the bias to the system more slowly than the barrier crossing time but faster than the slow inter-basin relaxation time, so that the transition state region has a lower risk of being biased, and therefore the transitions are less affected. By filling up a free energy basin by a known amount it is possible to determine how much the barrier has been decreased, and hence remove this bias from the rates determined. Thus, as described in more detail below, the approach works by performing a large number of individual simulations to obtain first passage times between the individual basins, which are then corrected by the known enhancement factors to obtain estimates of the unbiased rates. This method has been successfully used to reproduce the kinetics of conformational change of alanine dipeptide (Tiwary and Parrinello, 2013) unbinding of the inhibitor benzamidine from trypsin (Tiwary et al., 2015a) and slow unbinding of a simple hydrophobic cavity-ligand model system (Tiwary et al., 2015b).

In these simulations we used a deposition frequency of 80 or 100 ps (see parameters in Appendix 1—table 1), a value much lower than the deposition frequency of 1 ps used in the PathMetaD simulations described above. In this way we lower the risk of substantially corrupting the transition state region. In addition, a tight upper wall potential on Zpath=0.10 nm2 is used to confine the sampling based on our converged free energy surface which shows the conformational change mostly occurs within this region.

With these parameters we collected dozens of trajectories that have a state-to-state transition in the G-to-E and E-to-G directions. The passage times observed in each of these were then corrected for the metadynamics bias as follows.

First, we calculate the acceleration factor α from:

α=τ/τM=<eV(s,t)/kT>M

where the anguluar brackets denote an average over a metadynamics run before the first transition, and V(s,t) is the metadynamics time-depedent bias. The evolution of the acceleration factor α(t) can be expressed by:

α(t)=(1/t)0tdteV(s,t)/kT

Then the observed passage time, t, is reweighted by:

τtrue=α(t)*t=0tdteV(s,t)/kT

In principle, the transition time should be a Poisson-distributed random variable, and its mean, μ, standard deviation σ and median tm/ln2 all should be equal to each other. In practice, however, they are somewhat sensitive to insufficient sampling (Salvalaglio et al., 2014). So rather than simply calculating averages of the individual times, we estimated the average rate and transition time τ from a fit of the empirical cumulative distribution function (ECDF) with the theoretical cumulative distribution function (TCDF):

TCDF=1-e-tτ

It has previously been shown that τ estimated in this way converges more quickly than the simple average, μ. This is also consistent with our observation, and we find that 10–15 samples appear sufficient to get a reasonably accurate estimation of the transition time. We used a bootstrap approach to estimate the errors.

To examine whether the observed times indeed follow the expected Poisson distribution we used a Kolmogorov-Smirnov (KS) test to obtain a p-value that quantifies the similarity between the empirical and theoretical distributions. Traditionally, a threshold value typically of 0.05 or 0.01 (the significance level of the test) is used to judge if the theoretical (TCDF) and empirical (ECDF) distributions are in agreement. If the p-value is equal to or larger than the threshold value, it suggests that the estimated transition time is quite reliable. If (a) the transition regions were perturbed significantly with infrequent biasing or (b) there are hidden unidentified timescales at play (e.g. the CVs do not capture the slow degrees of freedom) the KS test for time-homogeneous Poisson statistics would fail.

CS-restrained replica-averaged simulation

The simulation methods described above constitute different ways of exploring the thermodynamics (PathMetaD), kinetics (InMetaD) and mechanism (Reconnaissance metadynamics) of conformational exchange. In all of these simulations, sampling is determined by the molecular energy function (force field), and the experimental information on T4L is used only in the construction of the path variables. When additional experimental information is available, one may introduce an additional energy term so as to bias the simulations to be in agreement with this information (Robustelli et al., 2010). As the experimental values are ensemble averages we apply these restraints only to averages calculated over a number of ‘replicas’ that are simulated in parallel. In this way, the information from the experimental data is incorporated into the simulation as a perturbation following the maximum entropy principle (Pitera and Chodera, 2012; Roux and Weare, 2013; Cavalli et al., 2013; Boomsma et al., 2014).

We used this approach to obtain conformational ensembles that include not only information from the molecular force field, but also experimental NMR chemical shifts (CS). In particular, we used either the chemical shifts of the E state of L99A obtained from the analysis of the CPMG experiments, or the native state chemical shifts of a triple mutant that populates the same state as its ground state (Bouvignies et al., 2011). The CS restraints were imposed by adding a pseudo-energy term (ECS) to a standard molecular-mechanics force field (EFF).

ECS=ϵCSi=1Nj=16(δijExp-1Mk=1MδijkSim)2

Here ϵCS is strength of the CS restraints, i indicates the residue number (total of N), j indicates each of the six backbone atoms whose chemical shifts were used (Cα, Cβ, C, Hα, HN and N), k is an index for the total of M replicas, and δExp and δSim are the experimental and simulated CSs, respectively. The latter quantity, δSim, was calculated by CamShift (Fu et al., 2014) as a plugin of PLUMED. The CS values of Pro, Gly, Asp, Glu, His and terminal residues were not included because the accuracy in their predictions are too low to contain sufficient information in this approach (Fu et al., 2014). We set ϵCS=24 kJ mol−1 ppm−2 and used M=4 replicas. In principle, the number of replicas is a free parameter that should be set as large as possible when the experimental data and the method for calculating it is noise free (Boomsma et al., 2014). In practice, one uses a finite set of replicas and it has been shown that M=4 replicas is sufficient to capture the dynamics accurately (Camilloni et al., 2013).

We performed replica-averaged CS restrained MD simulations using GROMACS4.6 and the PLUMED2.1 plugin at 298 K. The equilibrated structures of EROSETTA and GROSETTATriple were used as the starting configuration (of each of the four replicas) in the CS-restrained simulations of L99A and the L99A,G113A,R119P triple mutant, respectively. The CS data for EROSETTA and GROSETTATriple were obtained from the Biological Magnetic Resonance Bank (BMRB) database (Ulrich et al., 2008) with entries 17604 and 17603, respectively.

PT-WT-MetaD failed to get the converged free energy landscape

In practice, the choice of CVs plays a fundamental role in determining the accuracy, convergence and efficiency of metadynamics simulations. If an important CV is missing, the exploration of the free energy landscape will be difficult due to hysteresis. Finding a minimal set of CVs that include all important degrees of freedom is a highly nontrivial task and one often has to proceed by several rounds of trial simulations.

At the beginning, we followed a strategy which had previously been successfully used in the exploration of protein conformational transitions (Sutto and Gervasio, 2013; Papaleo et al., 2014) to design a set of CVs on the basis of static structural comparison between GXray and EROSETTA. In particular, by comparing these two structures we defined several CVs that described structural differences by individual dihedral angles and hydrogen bonds, as well as dihedral correlation and coordination number (state-specific contact map) (Bonomi et al., 2009) (summarized in Appendix 1—table 1). We combined these in a multiple-replica, parallel tempering approach in the well-tempered ensemble (PT-WT-metaD) (Bonomi and Parrinello, 2010) to further enhance the sampling. In PT-WT-metaD, the energy fluctuations are enlarged by using energy as a biased CV but the average energy is the same as the canonical ensemble, allowing the use of a larger spacing between temperatures and a much fewer number of replicas than normal PT simulations (Barducci et al., 2015). Coordinate exchange with high temperature replicas can enhance the sampling of all the degrees of freedom, even those not included in the biased CVs, and one may include a ‘neutral’ replica (without energy bias, at 298 K). We performed a series of simulations with different combinations of CVs starting from the G state of L99A (Appendix 1—table 2). However, unfortunately, we only observed partial G-to-E transitions, even in a relatively long trajectory of about 1 μs for each replica. This negative result suggested that these manually chosen CVs did not contain all the necessary slow degrees of freedom.

Tunnel analysis

We used CAVER3 (Chovancova et al., 2012) to analyse the structures and CAVER Analyst1.0 (Kozlikova et al., 2014)(http://www.caver.cz/, see also parameters in Appendix 1—table 4) to separate the tunnels into different clusters. CAVER Analyst is a standalone program based on CAVER3.0 algorithm (Chovancova et al., 2012). The settings for the tunnel calculations can be set through the Tunnel Computation window, while the advanced parameters can be set in the Tunnel Advanced Settings window. We used the center of mass of the cavity-related region (residue 93-124) as the position of the starting point. Average-link hierarchical clustering algorithm is performed to build a tree hierarchy of tunnel axes based on their pairwise distances. The size of the resulting clusters is dependent on the Clustering threshold parameter which specifies the level of detail at which the tree hierarchy of tunnel clusters will be cut. We used the default value so that the tree hierarchy of tunnel clusters is cut at the value of 3.5.

Adiabatic bias molecular dynamics

Adiabatic biased molecular dynamics (ABMD) (Marchi and Ballone, 1999; Paci and Karplus, 1999) is an algorithm developed to accelerate the transition from the reactant state to the productive state, here corresponding to the ligand bound state and ligand-free state, respectively. In ABMD the system is perturbed by a ‘ratcheting potential’, which acts to ‘select’ spontaneous fluctuations towards the ligand-free state. The ratcheting potential is implemented in PLUMED2.2 as

V(ρ(t))={0.5K(ρ(t)ρm(t))2,ρ(t)>ρm(t)0,ρ(t)ρm(t)

where

ρ(t)=(S(t)-Starget)2

and

ρm(t)=min0τtρ(τ)+η(t)

K is the force constant, S(t) is the instantaneous CV value, Starget is the target value of the CV and η(t) is an additional white noise acting on the minimum position of ρ(t). Here, we used the RMSD of the ligand to the cavity-bound state as the CV, and set Starget=4.0 nm and K=20 kJ · mol−1 · nm−2 or 50 kJ · mol−1 · nm−2 to check the dependency of the force constant chosen. The biasing potential is zero when the CV increases but provides a penalty when the CV decreases. In this way, we were able to observe multiple unbinding events in simulations despite the long lifetime (1.2 ms) of the ligand in the cavity.

Appendix 1—figure 1
Two representative InMetaD trajectories of L99A with G to E transitions.

The time point, t′, for the first transition from G to E is identified when the system evolves into conformational region of Spath > 0.55 and Zpath < 0.01. We then calculate the unbiased passage time by multiplying t′ by the corresponding accelerate factor α(t′). Upper panels show the evolution of the reweighted time as a function of metadynamics time. The kinks usually indicate a possible barrier crossing event. Middle panels show the trajectories starting from the G state and crossing the barrier towards the E state. Lower panels show the biasing landscape reconstructed from the deposited Gaussian potential, which can be used to check the extent to which the transition state regions are affected by deposited bias potential.

https://doi.org/10.7554/eLife.17505.030
Appendix 1—figure 2
Two representive InMetaD trajectories of L99A with E to G transitions of L99A.

First transition time for G to E transition is identified when the system evolves into conformational region of Spath < 0.28 and Zpath < −0.01.

https://doi.org/10.7554/eLife.17505.031
Appendix 1—figure 3
Characteristic transition times between G and E states of L99A.

The error bars represent the standard deviation of τ obtained from a bootstrap analysis.

https://doi.org/10.7554/eLife.17505.032
Appendix 1—figure 4
Characteristic transition times between G and E states of the triple mutant.

The figure shows the characteristic transition time τGE (left panel) and τEG (right panel) of the triple mutant as a function of the size of a subsample of transition times randomly extracted from the main complete sample. The error bars represent the standard deviation of characteristic transition times obtained by a bootstrap analysis. The calculated and experimental values of the transition times are shown in blue and red font, respectively.

https://doi.org/10.7554/eLife.17505.033
Appendix 1—figure 5
Poisson fit analysis for G to E transitions and E to G transitions of L99A.

We show the ECDF (the empirical cumulative distribution function) and TCDF (the theoretical cumulative distribution function) in black and blue lines, respectively. The respective p-values are reasonably, albeit not perfectly, well above the statistical threshold of 0.05 or 0.01, indicating the kinetics is not substantially modified by the deposited bias potential in InMetaD. Error bars are the standard deviation obtained by a bootstrap analysis.

https://doi.org/10.7554/eLife.17505.034
Appendix 1—figure 6
Poisson fit analysis for G to E transitions and E to G transitions of the triple mutant.

The figure shows the p-values of the Poisson fit analysis of GE (A) and EG (B) transition times as a function of the size of a subsample of transition times randomly extracted from the main complete sample.

https://doi.org/10.7554/eLife.17505.035
Appendix 1—table 1

Simulation details.

https://doi.org/10.7554/eLife.17505.036
Methodlabelsysteminitlengthforce fieldCVsparametersT (K)software version
PathMetaDRUN1L99AG667 nsCHARMM22*Spath,Zpathγ=20, τD=1 ps,τG=1 ps§, ω0=1.5298PLUMED2.1, GMX4.6
PathMetaDRUN2L99A, G113A, R119PG961 nsCHARMM22*Spath,Zpathγ=20, τD=1 ps, τG=1ps, ω0=1.5298PLUMED2.1, GMX5.0
CS-restrained MDRUN3L99AE252 nsCHARMM22*N#=4, ϵCS=24**298PLUMED2.1, ALMOST2.1, GMX5
CS-restrained MDRUN4L99AE233 nsCHARMM22*N=2, ϵCS=24298PLUMED2.1, ALMOST2.1, GMX5
CS-restrained MDRUN5L99AE221 nsCHARMM22*N=2, ϵCS=12298PLUMED2.1, ALMOST2.1, GMX5
CS-restrained MDRUN6L99AE125 nsAMBER99sb*ILDNN=4, ϵCS=24298PLUMED2.1, ALMOST2.1, GMX5
CS-restrained MDRUN7L99AE190 nsAMBER99sb*ILDNN=2, ϵCS=12298PLUMED2.1, ALMOST2.1, GMX5
CS-restrained MDRUN8L99A, G113A, R119PG204 nsCHARMM22*N=4, ϵCS=24298PLUMED2.1, ALMOST2.1, GMX5
CS-restrained MDRUN9L99A, G113A, R119PG200 nsCHARMM22*N=2, ϵCS=24298PLUMED2.1, ALMOST2.1, GMX5
Reconnaissance MetaDRUN10L99AG120 nsCHARMM22*2,3,4,5ΔT=10000††, Δt=250298PLUMED1.3, GMX4.5
Reconnaissance MetaDRUN11L99AG85 nsCHARMM22*2,3,4,510000, 250298PLUMED1.3, GMX4.5
Reconnaissance MetaDRUN12L99AG41 nsCHARMM22*2,3,4,5,7,8100000, 500298PLUMED1.3, GMX4.5
PT-WT-MetaDRUN13L99AG961 nsCHARMM22*1,2,3,7,8γ=20, ω0=0.5297 298‡‡ 303 308 316 325 341 350 362PLUMED 1.3, GMX4.5
PT-WT-MetaDRUN14L99AG404 nsCHARMM22*1,2,3,6,7γ=20, ω0=0.5297 298 303 308 316 325 333 341 350PLUMED 1.3, GMX4.5
PT-WT-MetaDRUN15L99AG201 nsCHARMM22*1,2,3,6γ=20, ω0=0.5297 298 303 308 316 325 333 341 350PLUMED 1.3, GMX4.5
PT-WT-MetaDRUN16L99AG511 nsCHARMM22*1,2,3,4,5γ=5, ω0=0.5295 298 310 325 341 358 376PLUMED 1.3, GMX4.5
PT-WT-MetaDRUN17L99AG383 nsCHARMM22*1,2,3,4,5γ=20, ω0=0.5298 305 313 322 332 337 343PLUMED 1.3, GMX4.5
PT-WT-MetaDRUN18L99AE365 nsCHARMM22*1γ=20, ω0=0.5298 302 306 311 317 323 330 338PLUMED 1.3, GMX4.5
plain MDRUN19L99AG400 nsCHARMM22*298GMX5.0
plain MDRUN20L99AE400 nsCHARMM22*298GMX5.0
plain MDRUN21L99AG400 nsAMBER99sb*ILDN298GMX5.0
plain MDRUN22L99AE400 nsAMBER99sb*ILDN298GMX5.0
plain MDRUN23L99A,G113A,R119PG400 nsCHARMM22*298GMX5.0
plain MDRUN24L99A,G113A,R119PE400 nsCHARMM22*298GMX5.0
plain MDRUN25L99A,G113A,R119PG400 nsAMBER99sb*ILDN298GMX5.0
plain MDRUN26L99A,G113A,R119PE400 nsAMBER99sb*ILDN298GMX5.0
InMetaDRUN27-68§§L99AGCHARMM22*Spath,Zpathγ=20, τG=80 ps, ω0=1.0, σS=0.016¶¶, σZ=0.002##298PLUMED2.1, GMX5
InMetaDRUN69-104***L99AECHARMM22*Spath,Zpathγ=20, τG=80 ps, ω0=1.0, σS=0.016, σZ=0.002298PLUMED2.1, GMX5
InMetaDRUN105-119L99A,G113A,R119PGCHARMM22*Spath,Zpathγ=15, τG=100ps, ω0=0.8, σS=0.016¶¶, σZ=0.002##298PLUMED2.2, GMX5.1.2
InMetaDRUN120-134L99A,G113A,R119PECHARMM22*Spath,Zpathγ=15, τG=100 ps, ω0=0.8, σS=0.016¶¶, σZ=0.002##298PLUMED2.2, GMX5.1.2
ABMDRUN135-154L99A-BenzeneboundCHARMM22*RMSDbenzeneK=50 KJ · mol−1 · nm−2, Starget=4.0 nm298PLUMED2.2, GMX5.1.2
ABMDRUN155-174L99A-BenzeneboundCHARMM22*RMSDbenzeneK=20 KJ · mol−1 · nm−2, Starget=4.0 nm298PLUMED2.2, GMX5.1.2
  1. γ is the bias factor.

  2. τD is the characteristic decay time used for the dynamically-adapted Gaussian potential.

  3. §τG is the deposition frequency of Gaussian potential.

  4. ω0 is the height of the deposited Gaussian potential (in KJ · mol−1).

  5. # N is the number of replicas.

  6. **ϵCS is the strength of chemical shift restraints (in KJ · mol−1 · ppm−2).

  7. ††ΔT means RUN_FREQ and Δt means STORE_FREQ. Other parameters: BASIN_TOL=0.2, EXPAND_PARAM=0.3, INITIAL_SIZE=3.0.

  8. ‡‡ Replica at 298 K is the neutral replica without energy bias.

  9. §§42 trajectories collected for G-to-E transitions.

  10. ¶¶σS is the Gaussian width of Spath.

  11. ##σZ is the Gaussian width of Zpath (in nm2).

  12. ***36 trajectories collected for E-to-G transitions.

Appendix 1—table 2

Definition of collective variables.

https://doi.org/10.7554/eLife.17505.037
CVdefinitionsparameterspurpose
1total energybin=500enhance energy fluctuations
2dihedral angle of Cα atoms of consecutive residues F104-Q105-M106-G107σ=0.1
3dihedral angle of Cα atoms of consecutive residues G113-F114-T115-N116σ=0.1
4QG, distance in contact map space to the GXray structureσ=0.5
5QE, distance in contact map space to the EROSETTA structureσ=0.5
6distance between QG and QEσ=0.5
7number of backbone hydrogen bonds formed between M102 and G107σ=0.1
8dihedral correlation between the Cα dihedral angles of consecutive residues in segment N101-G107σ=0.1
9global RMSD to the whole proteinwall potentialavoid sampling unfolding space
Appendix 1—table 3

Average root-mean-square deviation (<RMSD> in units of ppm) between experimental CSs and those from the CS-restrained replica-averaged simulations.

https://doi.org/10.7554/eLife.17505.038
NucleusRUN3RUN4RUN5RUN6RUN7RUN8RUN9
C0.8330.6550.7760.8540.7930.9070.727
Cα1.0550.8790.9291.0650.9401.1030.894
N1.9661.7071.7711.9671.7802.0111.828
HN0.3790.2750.2910.3680.2840.4140.286
HA0.2320.1830.1860.2420.1820.2460.183
Appendix 1—table 4

Parameter set used in tunnel analysis using CAVER3.0 (Chovancova et al., 2012) and CAVER Analyst1.0 (Kozlikova et al., 2014).

https://doi.org/10.7554/eLife.17505.039
Minimum probe radius0.9 Å
Shell depth4
Shell radius3
Clustering threshold3.5
Starting point optimization
Maximum distance3 Å
Desired radius5 Å
Appendix 1—table 5

Clustering analysis of tunnels (top three listed).

https://doi.org/10.7554/eLife.17505.040
IndexPopulationMaximal bottleneck radius (Å)Average bottleneck radius (Å)
#127%2.51.3
#220%1.41.0
#315%1.31.0
Appendix 1—table 6

Unbinding Pathways Explored by ABMD (RMSDBNZ as CV).

https://doi.org/10.7554/eLife.17505.041
k=20 kJ/(mol · nm−2)k=50 kJ/(mol · nm−2)
IndexLengthPathLengthPath
RUN156 nsP127 nsP2
RUN236 nsP278 nsP1
RUN343 nsP16 nsP1
RUN443 nsP135 nsP1
RUN577 nsP210 nsP1
RUN6176 nsP144 nsP1
RUN741 nsP118 nsP1
RUN8106 nsP115 nsP1
RUN972 nsP17 nsP1
RUN10107 nsP12 nsP1
RUN1161 nsP120 nsP2
RUN1258 nsP226 nsP1
RUN1364 nsP131 nsP2
RUN14173 nsP220 nsP1
RUN15172 nsP134 nsP1
RUN1674 nsP222 nsP1
RUN1720 nsP117 nsP1
RUN1834 nsP135 nsP2
RUN1991 nsP121 nsP2
RUN2061 nsP118 nsP1
Cost1.6 μs0.5 μs
Summary
P175% (15/20)75% (15/20)
P225% (5/20)25% (5/20)

References

  1. 1
  2. 2
  3. 3
    Molecular Modeling of Proteins
    1. A Barducci
    2. J Pfaendtner
    3. M Bonomi
    (2015)
    151–171, Tackling sampling challenges in biomolecular simulations, Molecular Modeling of Proteins, Springer.
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
  10. 10
  11. 11
  12. 12
  13. 13
  14. 14
  15. 15
  16. 16
  17. 17
    Impact of atomistic molecular dynamics simulations on understanding how proteins fold: an experimentalist’s perspective
    1. WA Eaton
    2. V Muñoz
    (2014)
    Bioinformatics.
  18. 18
  19. 19
  20. 20
  21. 21
  22. 22
  23. 23
  24. 24
  25. 25
  26. 26
  27. 27
  28. 28
  29. 29
  30. 30
  31. 31
  32. 32
  33. 33
  34. 34
  35. 35
    Pathways of conformational transitions in proteins
    1. P Majek
    2. R Elber
    3. H Weinstein
    (2008)
    In: G. A Voth, editors. Coarse-Graining of Condensed Phase and Biomolecular Systems ((Ed).). Boca Raton: FL: CRC Press. pp. 185–203.
  36. 36
  37. 37
  38. 38
  39. 39
  40. 40
  41. 41
  42. 42
    Rapid and accurate calculation of protein 1H, 13C and 15N chemical shifts
    1. S Neal
    2. AM Nip
    3. H Zhang
    4. DS Wishart
    (2003)
    Journal of Biomolecular NMR 26:215–240.
  43. 43
  44. 44
  45. 45
  46. 46
  47. 47
  48. 48
  49. 49
  50. 50
  51. 51
  52. 52
  53. 53
  54. 54
  55. 55
  56. 56
  57. 57
  58. 58
  59. 59
  60. 60
  61. 61
  62. 62
  63. 63
  64. 64
  65. 65
  66. 66
  67. 67
  68. 68
  69. 69

Decision letter

  1. Yibing Shan
    Reviewing Editor; DE Shaw Research, United States

In the interests of transparency, eLife includes the editorial decision letter and accompanying author responses. A lightly edited version of the letter sent to the authors after peer review is shown, indicating the most substantive concerns; minor comments are not usually included.

Thank you for submitting your article "Mapping transiently formed and sparsely populated conformations on a complex energy landscape" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom, Yibing Shan, is a member of our Board of Reviewing Editors, and another is James S Fraser (Reviewer #3), and the evaluation has been overseen by Michael Marletta as the Senior Editor. The reviewers have discussed the reviews with one another and the Reviewing Editor has drafted this decision to help you prepare a revised submission.

Summary:

In this manuscript, the authors employ state-of-art enhanced sampling techniques, including multiple variants of metadynamics, replica-averaged restrained MD and well-tempered replica-exchange MD, to conduct a systematic characterization of conformational dynamics of the L99A variant of T4L. This simulation study, which is shown to be in agreement with experimental measures of transition rates and kinetics, provides insights on the possible mechanism of conformational exchange between the ground state and the excited state. In particular, the excited state is shown to be conformationally broad and heterogeneous, and the transition path between the two states to be diverse. The work also identifies an key intermediate state for the transition, although its confirmation requires further experimental evidence.

Essential revisions:

Appreciating the thoroughness and the technical sophistication of the computation work and its robust connections with experimental data, the reviewers recognize the manuscript's value in benchmarking state-of-art simulation techniques and especially, in benchmarking conformational free energy calculations.

One of the more interesting parts of the of this study concerns the calculation of kinetic rate of the transition for L99A and the results is very encouraging. Because of its potential importance, this work would benefit greatly from additional calculations on mutants besides L99A. Specifically, the transition kinetics of L99A/G113A and L99A/G113A/R119P have been experimentally characterized. Calculating the kinetic rates of these two mutants and comparing the results with the experimental data would much strengthen case for the validity of such calculations.

The manuscript's attempt to elucidate the mechanism of ligand entrance is relatively weak. Further studies, for example, using simulations with ligand placed at the putative entrance to the channel in a partially open conformation would strengthen this part of the manuscript. If unsuccessful, the authors should consider weaken the related claims and de-emphasize the part of the work.

Relatedly, the reviewers think the manuscript's value lies more in benchmarking and validating state-of-art free energy calculation, than in providing significant new insight into the mechanism and function of T4L conformational transition. We would like to see revisions that de-emphasize the T4L specific claims and additional discussions with respect to benchmarking quantitative characterization of free energy landscapes using atomistic simulations.

https://doi.org/10.7554/eLife.17505.042

Author response

Essential revisions:

Appreciating the thoroughness and the technical sophistication of the computation work and its robust connections with experimental data, the reviewers recognize the manuscript's value in benchmarking state-of-art simulation techniques and especially, in benchmarking conformational free energy calculations.

One of the more interesting parts of the of this study concerns the calculation of kinetic rate of the transition for L99A and the results is very encouraging. Because of its potential importance, this work would benefit greatly from additional calculations on mutants besides L99A. Specifically, the transition kinetics of L99A/G113A and L99A/G113A/R119P have been experimentally characterized. Calculating the kinetic rates of these two mutants and comparing the results with the experimental data would much strengthen case for the validity of such calculations.

We thank the reviewers for their thoughtful comments on our work. As discussed in more detail below, we agree that the benchmarking aspects of our work are important. We have therefore extended our work and performed additional simulations and analyses. In particular, we have calculated the kinetics of exchange of the L99A/G113A/R119P triple mutant. As in the case of the L99A variant, we found results that were in good agreement with experiment. Specifically, the rates of exchange are within a factor of 10 from the experimental rates, and the free energy difference estimated from the rates (-1.2 kcal/mol) is very similar to the value (-1.6 kcal/mol) obtained from the equilibrium sampling. These results are presented in the revised manuscript.

As suggested by the reviewers we have also performed extensive simulations of the L99A/G113A double mutant. At the end of a 1 microsecond long PathMetaD simulation, the free energy difference obtained is ~ -0.5 kcal/mol, in apparently good agreement with experiment (0.3 kcal/mol). In contrast to the results from L99A and the triple mutant we, however, find that this value appears not to be converged with substantial fluctuations and a (non-converged) free energy landscape that differs in shape from that of L99A and L99A/G113A/R119P. Thus, the calculated free energy difference is (in contrast to L99A and the triple mutant) more sensitive to how the major and minor states are separated. Thus, the good agreement that we observe could very well be fortuitous, and we have therefore opted not to include them in the revised manuscript. Instead we mention that we have performed these simulations, but that they did not converge.

The manuscript's attempt to elucidate the mechanism of ligand entrance is relatively weak. Further studies, for example, using simulations with ligand placed at the putative entrance to the channel in a partially open conformation would strengthen this part of the manuscript. If unsuccessful, the authors should consider weaken the related claims and de-emphasize the part of the work.

We agree that the link between the transiently opened tunnel and the pathway for ligand binding and release was weak (as indeed we stated). While we believe that a more quantitative study lies outside the scope of this work, we have now performed and include the results from a set of simulations aimed to demonstrate a more direct link between the tunnel observed in the apo protein and the pathway for ligand (un)binding. In particular, we have used so-called adiabatic biased molecular dynamics (ABMD) to study the mechanism by which the ligand escapes its internal binding site. In these simulations, a bias is introduced for the ligand to leave its binding site, but leaving the specific pathway to be determined by the molecular simulations. Of the 20 ligand unbinding events we observed in such simulations, 15 take a path located in the same region as the transiently opened tunnel we found in the apo protein. To examine the extent to which this result depends on the amount of bias used, we performed these simulations with two different strengths of the biasing force, and reassuringly found the same results in both cases (i.e. 15/20 in both cases). These new results are included in the manuscript. We now also explicitly show the tunnel induced by binding of an alkyl-benzene to the protein.

Relatedly, the reviewers think the manuscript's value lies more in benchmarking and validating state-of-art free energy calculation, than in providing significant new insight into the mechanism and function of T4L conformational transition. We would like to see revisions that de-emphasize the T4L specific claims and additional discussions with respect to benchmarking quantitative characterization of free energy landscapes using atomistic simulations.

We agree that a key result of our work lies in the testing and benchmarking of the ability of molecular simulations to map conformational free energy landscapes and determine the free energy and kinetics associated with conformational exchange. Prompted by the suggestion of the reviewers we have thus strengthened these aspects in several parts of the manuscript. This also includes the newly described kinetics for the triple mutant, as well as the fact that for both L99A and the triple mutant we find good agreement between the kinetic and equilibrium estimate of the free energy difference. In addition, we have toned down some of the more specific points that mostly relate to T4L. For example, we have now completely removed the section and results describing our studies of domain-domain motions in the minor state of L99A T4L. We have also shortened somewhat the description of the results relating to the mechanism of conformational exchange, and finally we have reorganized the sections to bring together the validation parts more coherently. We believe that these changes together make the focus of the paper stronger.

https://doi.org/10.7554/eLife.17505.043

Article and author information

Author details

  1. Yong Wang

    Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
    Contribution
    YW, Conception and design, Acquisition of data, Analysis and interpretation of data, Drafting or revising the article
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-9156-0377
  2. Elena Papaleo

    Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
    Contribution
    EP, Acquisition of data, Analysis and interpretation of data
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-7376-5894
  3. Kresten Lindorff-Larsen

    Structural Biology and NMR Laboratory, Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
    Contribution
    KL-L, Conception and design, Analysis and interpretation of data, Drafting or revising the article
    For correspondence
    lindorff@bio.ku.dk
    Competing interests
    The authors declare that no competing interests exist.
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0002-4750-6039

Funding

Novo Nordisk Foundation

  • Kresten Lindorff-Larsen

Carlsbergfondet

  • Kresten Lindorff-Larsen

Danish e-Infrastructure Cooperation

  • Kresten Lindorff-Larsen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank Pratyush Tiwary and Matteo Salvalaglio for the help on the usage and analysis of InMetaD, Gareth Tribello for the help on the usage of reconnaissance metadynamics, Lewis E. Kay, Pengfei Tian, Shuguang Yuan, Micha Ben Achim Kunze for their helpful discussions and comments. Ludovico Sutto and Francesco Gervasio are thanked for input on metadynamics simulations.

Reviewing Editor

  1. Yibing Shan, DE Shaw Research, United States

Publication history

  1. Received: May 4, 2016
  2. Accepted: August 22, 2016
  3. Accepted Manuscript published: August 23, 2016 (version 1)
  4. Version of Record published: October 4, 2016 (version 2)

Copyright

© 2016, Wang et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 1,845
    Page views
  • 446
    Downloads
  • 24
    Citations

Article citation count generated by polling the highest count across the following sources: Scopus, Crossref, PubMed Central.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Biochemistry and Chemical Biology
    Ma Teresa Pellicer Martinez et al.
    Research Article
    1. Biochemistry and Chemical Biology
    2. Structural Biology and Molecular Biophysics
    Michael S Woody et al.
    Research Article