Abstract
The relationship between protein dynamics and function is essential for understanding biological processes and developing effective therapeutics. Functional sites within proteins are critical for activities such as substrate binding, catalysis, and structural changes. Existing computational methods for the predictions of functional residues are trained on sequence, structural and experimental data, but they do not explicitly model the influence of evolution on protein dynamics. This overlooked contribution is essential as it is known that evolution can fine tune protein dynamics through compensatory mutations, either to improve the proteins’ performance or diversify its function while maintaining the same structural scaffold. To model this critical contribution, we introduce DyNoPy, a computational method that combines residue coevolution analysis with molecular dynamics (MD) simulations, revealing hidden correlations between functional sites. DyNoPy constructs a graph model of residue-residue interactions, identifies communities of key residue groups and annotates critical sites based on their roles. By leveraging the concept of coevolved dynamical couplings—residue pairs with critical dynamical interactions that have been preserved during evolution—DyNoPy offers a powerful method for predicting and analysing protein evolution and dynamics. We demonstrate the effectiveness of DyNoPy on SHV-1 and PDC-3, chromosomally encoded β-lactamases linked to antibiotic resistance, highlighting its potential to inform drug design and address pressing healthcare challenges.
Introduction
Quantifying the contribution of individual residues or residue groups to protein function is important to estimate the pathogenic effect of mutations (1). Identifying the functional roles of individual residues has primarily been done through mutagenesis experiments (2). Bioinformatics methods have complemented these approaches through analysis of multiple sequence alignments (MSA) of homologous proteins and structural data (3–8). Among these methods, computational techniques that can decode inter-residue evolutionary relationships from MSAs have paved the way for machine learning (ML) based strategies that can predict protein structure (9–12), stability (13), and function (7) and extend the scope of computational protein design (14–16). A most recent approach has combined experimental data from three proteins, NUDT15, PTEN and CYP2C9, on stability and function with sequence and structural features to train a ML model to predict functional sites (17).
Functional sites are often regulated by both, local and global interactions. Changes in these interactions are instrumental for functional events like substrate binding, catalysis, and conformational changes (18). While techniques described earlier have exploited the sequence and structure relationship of proteins, they do not directly consider the time evolution of functional events in their predictions. The development of physical models of protein dynamics and the increase in available computational power has stimulated the adoption of computational techniques (19, 20) to investigate the conformational dynamics of proteins, an essential component of the many biological functions (21, 22). It has also been established that evolution through compensatory mutations in dynamic regions, like hinges and loops, can fine tune protein structural dynamics and introduce promiscuity, thereby diversifying biological function. Assuming that protein functional dynamics is conserved during evolution, significant information on dynamic regions and substrate recognition sites should be recoverable using inter residue coevolution scores extracted from MSAs (23, 24). Coevolution analysis and Molecular Dynamics (MD) simulations have independently (25) and synergistically been combined in the past to identify important residues for function (26–30). Yet a method that combines hidden information on dynamics from evolution with direct information on local and global dynamics from conformational ensembles from MD is not available.
Here, we present DyNoPy, a computational method that can extract hidden information on functional sites from the combination of pairwise residue coevolution data and powerful descriptors of dynamics extracted from the analysis of MD ensembles. The method can detect coevolved dynamic couplings, i.e. residue pairs with critical dynamical interactions that have been preserved during evolution. These pairs are extracted from a graph model of residue-residue interactions. Communities of important residue groups are detected, and critical sites are identified by their eigenvector centrality in the graph (Fig. 1). We demonstrate the power of this approach on SHV-1 and PDC-3 β-lactamases of major clinical importance (31, 32). DyNoPy successfully detects residue couplings that align with previous studies, guide in the explanations of mutation sites with previously unexplained mechanisms and provide predictions on plausible important sites for the emergence of clinically relevant variants.
Results and Discussion
β-lactamases are a group of enzymes capable of hydrolysing β-lactams, conferring resistance to β-lactam antibiotics (33). These enzymes are evolving rapidly, as single amino acid substitutions are sufficient to drive their evolution and increase their catalytic spectrum and inhibitor resistance profile (34). The widespread dissemination of β-lactamases across different bacterial species and their extensive emergence highlight their global impact on antibiotic resistance (35). The rapid evolution of β-lactamases and their clinical significance (34) makes them an ideal target for evaluating the robustness of DyNoPy.
In this study, we applied DyNoPy to two model enzymes from different β-lactamase families: class A β-lactamase SHV-1 (a chromosomally encoded enzyme in Klebsiella pneumoniae) and class C β-lactamase PDC-3 (a chromosomally encoded enzyme in Pseudomonas aeruginosa) (31, 32) (Fig. S1 and S2). Both class A and class C β-lactamases comprise an α/β domain and an α helical domain, with the active site situated in between (36, 37). Moreover, both enzymes target the carbonyl carbon of the β-lactams using a highly conserved serine residue (38, 39). Despite these similarities, the structures of class A and class C β-lactamases are remarkably different (Fig. 2).
SHV-1 is a very well characterised enzyme with wealth of information on mutations and their corresponding effects on protein function. In contrast, the information available on PDC-3 remains limited. Detailed structural information on these enzymes can be found in the supporting materials. Essential catalytic residues in SHV-1 are: S70, K73, S130, E166, N170, K234, G236, and A237 (40) and conserved catalytic residues in PDC-3 include S64, K67, Y150, N152, K315, T316, and G317. Highly conserved stretches of 3-9 hydrophobic residues, annotated as hydrophobic nodes, exists in class A β-lactamases and have been proven to be essential for protein stability (41). Residues defined as belonging to hydrophobic nodes within SHV-1 are listed in Supporting Table S1.
In SHV-1, the predominant extended spectrum β-lactamase (ESBL) substitutions occur at L35, G238, and E240, while R43, E64, D104, A146, G156, D179, R202, and R205 appear in ESBLs with lower frequency (42). Mutations at M69, S130, A187, T235, and R244 are known to induce inhibitor resistance in the enzyme (43). In PDC-3, substitutions primarily occur on the Ω-loop, enhancing its flexibility to accommodate the bulky side chains of antibiotics, while deletions are more common in the R2-loop (39). The predominant Ω-loop mutations isolated from clinics are found at positions V211, G214, E219, and Y221 (44).
Emergence of highly conserved dynamic couplings
DyNoPy builds a pairwise model of conserved dynamic couplings detected by combining coevolution scores and information on functional motions into a score Jij (see Methods and Fig. 1). To this end a dynamics descriptor should be selected. When the descriptor is associated with functional conformational changes, it is expected that functionally relevant couplings will report higher scores. Dynamics descriptors can be selected from commonly used geometrical collective variables (CVs) for the analysis of MD trajectories (see Methods). As expected, the average J matrix score varies across the different CVs, with some of them showing no signal of dynamic coupling (Supporting Fig. S4C).
SHV-1 and PDC-3 exhibit distinct dynamics, requiring a different choice of the CV that best captures the functional dynamics. For SHV-1, the global first principal component (PC1) proved to be the most effective feature, identifying 571 residue pairs with a Jij value greater than 0. Conversely, PDC-3 requires selection of more localized features that can extract the Ω-loop dynamics from the overall protein motion. Among the dynamic descriptors, the partial first time-lagged component (TC1_partial) performed best for PDC-3, detecting 216 residue pairs with a Jij value greater than 0. Consequently, PC1 and TC1_partial were selected to build the J matrix for SHV-1 and PDC-3, respectively. The performance of all 12 CVs for each protein was assessed and listed in the Supporting Table S2.
The importance of dynamical information is evident when coevolution couplings (γij) and conserved dynamic couplings (Jij) are compared: the number of non-zero couplings decrease from 40% to <2% of total residue pairs in the protein (Fig. S4D) when information from the dynamics descriptor is added. Thus, the inclusion of protein dynamics in coevolution studies acts as an effective filter that rules out residue pairs that do not have significant correlations with functional motions. Moreover, when relying only on γij, all the residues in SHV-1 and PDC-3 are included within four identified communities (Supporting Table S3), suggesting that coevolution scores (γij) alone do not effectively discriminate residues relevant for protein functions. Furthermore, it would be hard to distinguish critical core residues for each community using only γij, as the eigenvector centrality (EVC) values for the residues do not show remarkable differences (Supporting Fig. S5A and S5B). This means that detailed dynamic investigation of the top residues is needed to determine which pairs should be picked up and further analysed. On the other hand, it is much easier to identify essential residues based on J scores calculated, as clear outliers with significantly higher EVC values could be seen for almost all communities (Supporting Fig. S5C and S5D) (25, 45). In conclusion, the lack of specificity in the statistically based coevolution analysis supports the choice of incorporating a score for the correlation between residue interactions and dynamic behaviours that enables deconvolution of community information.
DyNoPy reveals critical residues and predicts evolutionary pathways in SHV-1
DyNoPy identified eight significant communities of strongly coupled residues within SHV-1 (Supporting Fig. S4A) with all crucial catalytic residues and critical substitution sites previously mentioned participating in one of these communities with the exceptions of R43, R202, and S130.
Residues previously known to have critical role in function or conferring ESBLs/IRBLs phenotype are either directly coupled to protein dynamics or act as a central hub. The hubs interact with residues with either a role in catalysis or structural stability through their membership of hydrophobic nodes (31). Furthermore, DyNoPy identified key positions (L162 and N136) within some communities that are known to undergo substitutions, conferring an ESBL phenotype in other class A β-lactamases. These substitutions have not yet emerged in the SHV family, providing insightful predictions about the potential future evolution of the enzyme. Detailed description of the other three communities is provided in the supporting information (Fig. S6).
DyNoPy predicts mutation hotspots in SHV-1
DyNoPy detects critical mutation sites (L162 and N136) that are known to extend the range of substrates in other class A β-lactamases but have not yet emerged as variants in the SHV family. These sites have not been modified in SHV family because of their plausible central role within the communities as they are mediating couplings with key functional residues essential for catalytic activity and structural stability, indicating their critical role in protein function and the potential lower mutation rate. These findings provide insightful predictions about the potential future evolution of the enzyme, as well as plausible explanations for why these mutations have not yet appeared.
L162, positioned at the start of the Ω-loop and adjacent to the crucial catalytic residue E166, is assigned as the core residue for community 1 (Fig. 3A). While it remains conserved in SHV family, variants of L162 have been isolated in other class A β-lactamase and are known to expand the enzyme catalytic spectrum. Single amino acid substitution at L162 can intensify antibiotic resistance in BEL-1 (46), a class A ESBL clinical variant, exhibiting robust resistance to ticarcillin and ceftazidime (47). BEL-2 diverges from BEL-1 by single amino acid substitution (L162F) which alters the kinetic properties of the enzyme significantly and increases its affinity towards expanded-spectrum cephalosporins (48). The relationship between L162 and protein catalytic functions can be explained using DyNoPy model, as there are couplings with catalytic important residues M69, K73, E166, and K234. Moreover, the BEL case has confirmed that L162F mutation significantly destabilizes the overall protein structure, highlighting the crucial role of L162 in maintaining protein stability (46). DyNoPy accurately identifies the centrality of L162 by reporting its connections with 28 backbone residues, including nine hydrophobic node residues critical for protein stability. Among these, five hydrophobic residues are part of the α2 node: V75, L76, G78, V80, and L81, highlighting the contribution of L162 to the stability of the α2 helix (31).
Just like L162, N136 undergoes advantageous mutations in other class A β-lactamases while remains highly conserved within the SHV family. It is the core residue for community 7 (Fig. 4B). This residue forms a hydrogen bond with E166, stabilizing the Ω-loop (49). N170 acts as an intermediary between N136 and E166. N170, an essential catalytic residue located on the Ω-loop, contributes to priming the water molecule for the deacylation step with E166 (50) and is directly coupled with N136. Due to the essential contribution of N136 in facilitating E166 to maintain its proper orientation, it was previously thought to be intolerant to mutations as substitution of Asparagine to Alanine at this position would make the enzyme lose its function completely (51). However, N136D substitution has emerged as a new clinical variant very recently in PenL, a class A β-lactamase, by increasing its ability in hydrolysing ceftazidime (51), suggesting that this site has potential to mutate. This gain of function is mainly triggered by the increased flexibility of the Ω-loop (51). DyNoPy correctly detect a dynamical relationship between N136 and the Ω-loop (residues 164-179). Six residues present in the Ω-loop participate within this community, including R164 and D179. These two residues are critical as they are forming the ‘bottleneck’ of the Ω-loop which is essential for the correct position of E166 (52). D179 is also a critical mutation site for SHV-1. Single amino acid substitutions like D179A, D179N, and D179G are enough for the extended spectrum phenotype (42).
DyNoPy detects residue couplings essential for protein stability
DyNoPy identifies residue couplings critical for protein functional motions, particularly associated with protein stability. These residue pairs exhibit strong relationships as they are not only directly coupled with each other, but also forms various indirect couplings via other residues. As a result, both residues are considered as core residues inside these communities. It is expected that disruption of these couplings through mutation could compromise collective motions essential for enzyme activity.
As the secondary core residues in community 1 (Fig. 3A) F72 is showing a strong coupling with the primary core residue L162 and also forms nine indirect couplings with L162, including via the catalytic K234. This network of direct and indirect relationships reveals the importance of F72 and L162 coupling in maintaining protein functional motions. Interestingly, previous studies identified a small hydrophobic cavity formed by L162 and F72, together with L139, and L148, which is essential for the stability of the active site (46). Notably, DyNoPy successfully recovers the key residues of this local hydrophobic cavity (L162, F72, and L148).
The strong interplay between V103 and S106, which are both residues on the α3-α4 loop, is seen in community 5 (Fig. 3C). These residues not only interact with each other directly but are also indirectly coupled via 21 other residues. This community emphasizes the significance of hydrophobic nodes in SHV stability and dynamics. Within the analysed 48 residues, 27 are hydrophobic, out of which 15 residues act as nodes critical for enzyme stabilization. Hydrophobic nodes stabilize their own secondary structures and interconnect to stabilize the overall protein (53). V103 and S106 themselves are hydrophobic nodes, stabilizing α3 helix and α4 helix respectively, and are strongly coupled with each other. In CTX-M, another class A enzyme, N106S is a common substitution that results in improved thermodynamic stability and compensate for the loss in stability of the variants (54). Interestingly, this residue is already a Serine in SHV, but still implies its pivotal role in protein stability.
DyNoPy provides valid explanations for mutation sites
During the evolution of β-lactamases, single mutations on specific sites that are distant from the functional sites have been observed to significantly alter protein catalytic functions. Additionally, single mutations on some surface exposed residues can dramatically increase protein stability. Understanding how these distant mutations impact function and stability becomes a major challenge in understanding protein evolutionary pathways. Communities extracted by DyNoPy show these residues linked with functional important residues, providing a rational for these mutation sites with unknown functions.
Mutations of G156 are limited but they lead to ESBL phenotype in the SHV family (42). G156 is the central residue for community 4 (Fig. 3B), but it is distant from the active site, over 20Å away from the catalytic serine S70. Clinical variant SHV-27, has extended resistance ability towards cefotaxime, ceftazidime, and aztreonam (55). It differs from SHV-1 by single amino acid substitution G156D, suggesting that it has directly evolved from SHV-1 (55). Limited research has been done on position G156, and the understanding of how it affects the enzyme catalytic properties given that it is far away from the active site is still unclear. Based on our results, we suggest that this residue is essential for the overall protein function because of its 11 coevolved dynamic couplings with protein dynamics, including A146, another ESBL substitution site.
SHV-38, another ESBL that is capable of hydrolysing carbapenems, harbours a single A146V substitution compared to SHV-1 (56). Like G156, A146 is 15 Å away from S70 but shows an ability in altering protein catalytic function. The A146-G156 residue pair shows a strong coevolutionary signal and strong correlation with protein overall dynamics, implying that there may compensatory mutations at these sites with potential to emerge in the SHV family in the future. These two residues are not connected to any catalytic residues but their coupling to functional dynamics can offer plausible explanation to ESBL activity of these two mutations.
Unlike other substitution sites that are adjacent to the active site, R205 is situated more than 16 Å away from catalytic serine S70. Its side chain points outward from the protein, exposing to the solvent. The R205L substitution often co-occurs with other ESBL mutations and is thought to indirectly contribute to the ESBL phenotype by compensating for stability loss induced by other mutations (57). SHV-3 is an ESBL that exhibits significant resistance to cefotaxime and ceftriaxone (58). Two substitutions in this enzyme, R205L and G238S, extend its resistance profile (58). Thus, it is promising to see that DyNoPy detected these two mutation sites together within community 6 (Fig. 4A).
Y105 and R266 are the core residues for community 6. Y105 is situated on the α3-α4 loop positioned at the left side of the binding pocket. It is an important catalytic residue that recognizes and binds to the thiazolidine ring of penicillins or β-lactamase inhibitors (59). There is very limited information on the role of R266, except that it may stabilize the Ω-loop in the SHV family similar to the analogous T266 in TEM (60). G238 is coupled with an essential catalytic residue Y105, which further links with other catalytic functional residues: S70 and A237, and R266, a residue that known to stabilize the Ω-loop. This indicates that mutations on G238 would result in an alteration on protein catalytic function, as well as an increased flexibility of the protein, which strongly aligns with previous finding. Its linked mutation site R205 does not showing direct coupling with any catalytic residues. Instead, it is directly coupled with R266, which we mentioned as an Ω-loop stabilizer. Thus, it is not surprising that R205 substitution alone is never observed in nature (61), as it would not give significant evolutionary advantage to the protein.
Insights into unexplained functional sites of PDC-3
Unlike the extensively studied SHV-1, the functional roles of individual amino acids in PDC-3 remains largely unexplored. This gap in understanding serves as welcome challenge for interpreting the effects of mutations and the dynamic behaviour of PDC-3 from our results. Although several mutation hotspots, such as those on the Ω-loop (44), have been identified, very little is known about the specific contributions of individual amino acids on the functionality of PDC-3.
In PDC-3, mutations have primarily been reported in the Ω-loop. They enhance its flexibility to accommodate the bulky side chains of antibiotics, while deletions are more common in the R2-loop (39). DyNoPy detected five communities in total (Fig. S4B) with all the four predominant Ω-loop mutations appeared in these communities. Community 3, 4 and 5 are explained in the supporting information (Fig. S7). Furthermore, DyNoPy also detected several previously unexplored Ω-loop residues.
G214, a known mutation site in PDC-3, is the core residue in community 1. Another two essential mutation sites: E219 and Y221 also participate in this community, directly coupled with G214 (Fig. 5A). G214 also has direct couplings with four other Ω-loop residues: A195, A197, G212, and L216. Previous results have demonstrated that substitutions of Glycine to Alanine or Arginine at 214 significantly destabilizes the Ω-loop (32). The strong correlation between G214 and these Ω-loop residues emphasizes the significant contribution of G214 towards the stability of the Ω-loop, which corroborates with previous results (32). Moreover, substitutions such as G214A and G214R and mutations on E219 and Y221 do not affect R2 loop flexibility, resulting in the smaller active site volume among variants (32) because none of the residues from the R2 loop are detected in this community offering plausible explanation to previously unexplained phenomenon.
G204 is the core residue of community 2, coupled with 73 other residues, most of which are distant from the catalytic site, suggesting plausible crucial role in overall protein stability like L162 in SHV-1 (Figure 5B). G204, a newly emerged mutation site in the PDC family (62), is located on the short β-sheet β5a within the Ω-loop, near the hinge region between β8 and β9 just above the active site. The only known variant of G204 is PDC-466, which was derived from PDC-462 (A89V, Q120K, V211A, N320S), with an addition of G204D (62). Coupling of G204 to several catalytically important residues, including K67, K315, and T316 can suggest that mutations at this site can negatively impact catalytic power. This offers a plausible explanation of seeing fewer variants at this site and mutations at this site could have impact on hydrolysing capabilities of PDC variants. This should be confirmed by further experimental studies of variants of G204. Unlike G214, E219 and Y221 mutations which do not influence the dynamics of the R2 loop, substitutions on V211, a member of Ω-loop, has impact on dynamics of R2 loop because of its indirect couplings, through G204 to R2-loop residues (32). Two less critical substitution sites, H188 and V329, were also observed in community 2.
Conclusions
DyNoPy combines information on residue-residue coevolution and protein dynamics captured from MD ensembles to detect conserved dynamic couplings. These couplings are modelled as a graph. The network analysis facilitates the extraction of epistatic subnetworks and the assignment of roles to residues based on their importance in the graph model. The choice of a relevant descriptor of protein dynamics has an impact on the ability to detect couplings that are involved in functional dynamics. Here we demonstrated how the choice of relevant global and local descriptors returns a higher number of effective couplings (greater than 0), and in turn leads to interpretable graph models and communities. In other systems, when multiple descriptors can be used to quantify functional conformational change, it is expected that they will differently modulate the effect of coevolution coupling, which will be reflected in a different structure of the associated graph models. This suggests the use of DyNoPy to generate comparative models in proteins with multiple functions associated to distinct dynamical changes.
Mutations of L162 and N136 have not yet emerged in SHV-1, but they are detected by DyNoPy as core residues for communities. These residues are strongly coupled with other functional important residues, which play critical roles in protein stability and catalytic activity. The identification of these couplings shows high consistency with previous studies and highlights the importance of L162 and N136 in SHV-1 functional dynamics. Given their central role in these communities, mutations in L162 and N136 can significantly alter protein function, suggesting their potential for future evolutionary changes. However, their strong relationships with these critical functional residues also suggest that mutation at these sites would need to be balanced to maintain protein function, providing an explanation for why such mutations have not yet emerged in SHV-1 (63). The ability of DyNoPy in detecting functionally important mutation sites was demonstrated via well-characterized mutation sites including R205 and G238 from SHV-1. Moreover, DyNoPy shows predictive ability on less-studied mutation sites such as G156 and A146, by detecting critical residue couplings that coevolved with functional motions.
Based on the knowledge we have gained from analysis of SHV-1 functional protein dynamics we suggest that in PDC-3, mutations at G204 because of its significant conserved dynamic couplings can lead to new ESBL/IRBL clinical variants. We suggest that DyNoPy can be used as a predictive tool to identify potential functional residues within this enzyme and guide future mutagenesis studies.
In summary, by integrating hidden evolutionary information with direct dynamic interactions, DyNoPy provides a powerful framework for identifying and analysing functional sites in proteins. The tool not only identifies key residues involved in local and global interactions, but also improves our ability to predict silent residues with previously unknown roles for future experimental testing. Our application of DyNoPy to broad-spectrum β-lactamases ESBLs and IRBLs demonstrates its potential to address key medical challenges such as antibiotic resistance by providing valid predictions on protein evolution.
Methodology
DyNoPy generates a graph representation of the protein structure that captures the couplings between amino acid residues contributing to the functional dynamics of the protein. Residues are represented as graph nodes, and conserved dynamic couplings are recorded as edges. Edge weights quantify the strength of these couplings. The model is built on two assumptions: residue pairs should have i) coevolved and their ii) time-dependent interactions correlate with a functional conformational change.
Therefore, edge weights (Jij) for residue i and j are calculated as:
where γij is the scaled coevolution score and ρij is the degree of correlation with the selected functional conformational change. α and β are weights assigned to γij and ρij that have a sum of one. The relative weight of the scaled coevolution score (α) is set to 0.5 in this study. When either of the assumptions listed above is not met, Jij is set to zero.
Scaled coevolution scores
The occurrence of residue-residue coevolution can be estimated and quantified using probabilistic models of correlated mutations from deep multiple sequence alignments (MSA). DyNoPy supports generation of the MSA using the HH-Suite package (64) and calculation of scaled coevolution score (γij) using CCMpred (65). First a pairwise residue coevolution matrix (C) is calculated, then these raw scores (cij) are divided by the matrix mean (Equation 2). All scores (sij) smaller than 1 are set to zero, and the remaining values are normalised by the maximum value (Equation 3):
Correlation with functional motions
The contribution of a residue pair to a selected functional motion is estimated by how much the change in interaction energy between the two residues over time is correlated with a collective variable (CV) describing the functional motion:
where εij(t) is the pairwise non-bonded interaction energy (see details in Supporting Information) and d(t) is the time-dependent value of the CV. Examples of CV and a discussion on the choice of the most relevant CV is presented in the results section. Correlation values smaller than 0.5 are set to 0.
Graph representation and analysis of conserved dynamic couplings
All pairwise conserved dynamic couplings (Equation 1) are collected into a square matrix J. A graph is built from J, using python-igraph v0.11 library (66). Nodes represent residues, and edges are drawn between nodes with positive Jij. Edge weights are set to Jij. The relative importance of the residues in this model of protein dynamics is calculated as eigenvector centrality of the nodes (67). Residues involved in extensive correlated dynamics with other highly connected residues have higher eigenvector centrality (EVC) scores. Groups of residues contributing to important collective motions are detected by community analysis of the graph structure. The Girvan-Newman algorithm is used to extract the community structure (68).
Adaptive Sampling Molecular Dynamics Simulations
MD simulation data was sourced from our previous studies (31, 32). To summarise, SHV-1 structural coordinates (PDB ID: 3N4I) were obtained from the Protein Data Bank and modified to the wild type by introducing the E104D mutation. Similarly, the PDC-3 structure was derived from PDC-1 (PDB ID: 4HEF) by a T105A substitution. Both enzymes were protonated at pH 7.0 using PropKa from the PlayMolecule platform (69). One disulfide bond between C77 and C123 was specified in SHV-1. Both structures were solvated with TIP3P water molecules in a periodic box with a box size of 10 Å. Ions were added to neutralize the overall charge of each system at 150mM KCl. Amber force field ff14SB was used for all MD simulations (70). After an initial minimisation of 1000 steps, both the enzymes were equilibrated for 5 ns in the NPT ensemble at 1 atmospheric pressure using the Berendsen barostat (71). The initial velocities for each simulation were sampled from the Boltzmann distribution at 300 K. Multiple Markov State Model (MSM)-based adaptively sampled simulations were performed for both proteins based on the ACEMD engine (72, 73). A canonical (NVT) ensemble with a Langevin thermostat (74) (damping coefficient of 0.1 ps−1) and a hydrogen mass repartitioning scheme were employed to achieve time steps of 4 fs. For SHV-1, each trajectory spanned 60 ns with a time step of 0.1 ns, with a total of 593 trajectories. In the case of PDC-3, 100 trajectories were collected, each containing 3000 frames, lasting 300 ns. To manage the extensive datasets efficiently, trajectories were strategically stridden to ensure that a minimum of 30,000 frames were preserved for each system. The resulting trajectories are summarized in Supporting Table S4.
Calculation and Selection of Collective Variables
Any vector that yields time-dependent information can be used as a collected variable (CV) to describe protein functional motions. The usefulness of DyNoPy is dependent on the choice of the CVs. To guide the selection of CVs, we selected 12 distinct features: radius of gyration (Rg), the first principal component (PC1), partial PC1 (PC1_partial), the first time-lagged independent component (TC1), partial TC1 (TC1_partial), global root mean square deviation (gRMSD), partial RMSD (pRMSD), dynamical RMSD (dRMSD), global solvent accessible surface area (gSASA), partial SASA (pSASA), active site pocket volume, and the number of hydrogen bonds (hbond). A description of the CVs, including the calculation methods and the residues used to calculate the partial variables, is detailed in the supporting information. CVs were subsequently used as input features for DyNoPy. The selection criteria for CVs were based on the number of residue pairs each CV could detect.
Data Availability
All files required to run the simulations (topology, coordinates, input), processed trajectories (xtc), corresponding coordinates (pdb), can be downloaded from the DOI https://doi.org/10.57760/sciencedb.15876 (PDC-3) and 10.5281/zenodo.13693144 (SHV-1). DyNoPy is available at https://github.com/alepandini/DyNoPy.
Acknowledgements
SCD was supported by Leverhulme Trust grant RPG-2017-222 awarded to AP and JAG. The authors would like to thank Arianna Fornili for insightful suggestions on the design of DyNoPy methodology.
Additional files
References
- 1.The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studiesHum Genet 136:665–677
- 2.Multiplex assessment of protein variant abundance by massively parallel sequencingNat Genet 50:874–882
- 3.The Context-Dependence of Mutations: A Linkage of FormalismsPLoS Comput Biol 12
- 4.Predicting and interpreting large-scale mutagenesis data using analyses of protein stability and conservationCell Rep 38
- 5.Rapid protein stability prediction using deep learning representationsElife 12
- 6.Exploring amino acid functions in a deep mutational landscapeMol Syst Biol 17
- 7.Mutation effects predicted from sequence co-variationNat Biotechnol 35:128–135
- 8.A large-scale evaluation of computational protein function predictionNat Methods 10:221–227
- 9.Highly accurate protein structure prediction with AlphaFoldNature 596:583–589
- 10.Evolutionary-scale prediction of atomic-level protein structure with a language modelScience 379:1123–1130
- 11.Accurate prediction of protein structures and interactions using a three-track neural networkScience 373:871–876
- 12.Protein structure prediction from sequence variationNat Biotechnol 30:1072–1080
- 13.Computational Modeling of Protein Stability: Quantitative Analysis Reveals Solutions to Pervasive ProblemsStructure 28:717–726
- 14.Protein design using structure-based residue preferencesNat Commun 15
- 15.Machine learning-assisted directed protein evolution with combinatorial librariesProc Natl Acad Sci U S A 116:8852–8858
- 16.An evolution-based model for designing chorismate mutase enzymesScience 369:440–445
- 17.Discovering functionally important sites in proteinsNat Commun 14
- 18.Allostery in Its Many Disguises: From Theory to ApplicationsStructure 27:566–578
- 19.The Role of Conformational Dynamics and Allostery in Modulating Protein EvolutionAnnu Rev Biophys 49:267–288
- 20.DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutationsProtein Sci 30:60–69
- 21.Dynamic personalities of proteinsNature 450:964–972
- 22.Conformational diversity and protein evolution--a 60-year-old hypothesis revisitedTrends Biochem Sci 28:361–368
- 23.Patterns of coevolving amino acids unveil structural and dynamical domainsProc Natl Acad Sci U S A 114:E10612–E10621
- 24.Sequence evolution correlates with structural dynamicsMol Biol Evol 29:2253–2263
- 25.Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scoresProteins 83:2293–2306
- 26.SPECTRUS: A Dimensionality Reduction Approach for Identifying Dynamical Domains in Protein Complexes from Limited Structural DatasetsStructure 23:1516–1525
- 27.From residue coevolution to protein conformational ensembles and functional dynamicsProc Natl Acad Sci U S A 112:13567–13572
- 28.Statistical coevolution analysis and molecular dynamics: identification of amino acid pairs essential for catalysisProc Natl Acad Sci U S A 102:994–999
- 29.Integrating molecular dynamics and co-evolutionary analysis for reliable target prediction and deregulation of the allosteric inhibition of aspartokinase for amino acid productionJ Biotechnol 154:248–254
- 30.Molecular dynamics simulations and statistical coupling analysis reveal functional coevolution network of oncogenic mutations in the CDKN2A-CDK6 complexFEBS Lett 587:136–141
- 31.The Role of Hydrophobic Nodes in the Dynamics of Class A beta-LactamasesFront Microbiol 12
- 32.Omega-Loop mutations control the dynamics of the active site by modulating a network of hydrogen bonds in PDC-3 beta-lactamasebioRxiv https://doi.org/10.1101/2024.02.04.578824
- 33.Resistance to beta-lactam antibioticsCell Mol Life Sci 61:2200–2223
- 34.Past and Present Perspectives on beta-LactamasesAntimicrob Agents Chemother 62
- 35.Proliferation and significance of clinically relevant beta-lactamasesAnn N Y Acad Sci 1277:84–90
- 36.Catalytic properties of class A beta-lactamases: efficiency and diversityBiochem J 330:581–598
- 37.Class C beta-Lactamases: Molecular CharacteristicsClin Microbiol Rev 35
- 38.Structural and Mechanistic Basis for Extended-Spectrum Drug-Resistance Mutations in Altering the Specificity of TEM, CTX-M, and KPC beta-lactamasesFront Mol Biosci 5
- 39.AmpC beta-lactamasesClin Microbiol Rev 22:161–182
- 40.A standard numbering scheme for the class A beta-lactamasesBiochem J 276:269–270
- 41.Defining the architecture of KPC-2 Carbapenemase: identifying allosteric networks to fight antibiotics resistanceSci Rep 8
- 42.A Review of SHV Extended-Spectrum beta-Lactamases: Neglected Yet UbiquitousFront Microbiol 7
- 43.Tazobactam inactivation of SHV-1 and the inhibitor-resistant Ser130-->Gly SHV-1 beta-lactamase: insights into the mechanism of inhibitionJ Biol Chem 279:19494–19501
- 44.Deciphering the Evolution of Cephalosporin Resistance to Ceftolozane-Tazobactam in Pseudomonas aeruginosamBio 9
- 45.Eigenvector centrality for characterization of protein allosteric pathwaysProc Natl Acad Sci U S A 115:E12201–E12208
- 46.Crystal Structure of the Pseudomonas aeruginosa BEL-1 Extended-Spectrum beta-Lactamase and Its Complexes with Moxalactam and ImipenemAntimicrob Agents Chemother 60:7189–7199
- 47.Emergence and dissemination of BEL-1-producing Pseudomonas aeruginosa isolates in BelgiumAntimicrob Agents Chemother 51:1584–1585
- 48.BEL-2, an extended-spectrum beta-lactamase with increased activity toward expanded-spectrum cephalosporins in Pseudomonas aeruginosaAntimicrob Agents Chemother 54:533–535
- 49.Conserved water molecules stabilize the Omega-loop in class A beta-lactamasesAntimicrob Agents Chemother 52:1072–1079
- 50.Detailed investigation of catalytically important residues of class A beta-lactamaseJ Biomol Struct Dyn 41:2046–2073
- 51.Non-catalytic-Region Mutations Conferring Transition of Class A beta-Lactamases Into ESBLsFront Mol Biosci 7
- 52.The Structural Role of N170 in Substrate-Assisted Deacylation in KPC-2 beta-LactamaseAngew Chem Int Ed Engl 63
- 53.Allosteric communication in class A beta-lactamases occurs via cooperative coupling of loop dynamicsElife 10
- 54.An active site loop toggles between conformations to control antibiotic hydrolysis and inhibition potency for CTX-M beta-lactamase drug-resistance enzymesNat Commun 13
- 55.SHV-27, a novel cefotaxime-hydrolysing beta-lactamase, identified in Klebsiella pneumoniae isolates from a Brazilian hospitalJ Antimicrob Chemother 47:463–465
- 56.Emergence in Klebsiella pneumoniae of a chromosome-encoded SHV beta-lactamase that compromises the efficacy of imipenemAntimicrob Agents Chemother 47:755–758
- 57.Characterization of a novel extended-spectrum TEM-type beta-lactamase, TEM-164, in a clinical strain of Klebsiella pneumoniae in TunisiaMicrob Drug Resist 15:195–199
- 58.Molecular characterization of the gene encoding SHV-3 beta-lactamase responsible for transferable cefotaxime resistance in clinical isolates of Klebsiella pneumoniaeAntimicrob Agents Chemother 33:2096–2100
- 59.Role of Asp104 in the SHV beta-lactamaseAntimicrob Agents Chemother 50:4124–4131
- 60.Structure of the SHV-1 beta-lactamaseBiochemistry 38:5720–5727
- 61.A Genotype-Phenotype Correlation Study of SHV beta-Lactamases Offers New Insight into SHV Resistance ProfilesAntimicrob Agents Chemother 64
- 62.Development of antibiotic resistance reveals diverse evolutionary pathways to face the complex and dynamic environment of a long-term treated patientbioRxiv https://doi.org/10.1101/2021.05.14.444257
- 63.Mutational effects and the evolution of new protein functionsNat Rev Genet 11:572–582
- 64.HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignmentNat Methods 9:173–175
- 65.CCMpred--fast and precise prediction of protein residue-residue contacts from correlated mutationsBioinformatics 30:3128–3130
- 66.The igraph software package for complex network research
- 67.Detecting community structure in networksThe European Physical Journal B - Condensed Matter 38:321–330
- 68.Finding community structure in networks using the eigenvectors of matricesPhys Rev E Stat Nonlin Soft Matter Phys 74
- 69.PlayMolecule ProteinPrepare: A Web Application for Protein Preparation for Molecular Dynamics SimulationsJ Chem Inf Model 57:1511–1516
- 70.ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SBJ Chem Theory Comput 11:3696–3713
- 71.Molecular dynamics with coupling to an external bathThe Journal of Chemical Physics 81:3684–3690
- 72.HTMD: High-Throughput Molecular Dynamics for Molecular DiscoveryJ Chem Theory Comput 12:1845–1852
- 73.ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time ScaleJ Chem Theory Comput 5:1632–1639
- 74.Langevin thermostat for rigid body dynamicsJ Chem Phys 130
Article and author information
Author information
Version history
- Preprint posted:
- Sent for peer review:
- Reviewed Preprint version 1:
Copyright
© 2025, Xu et al.
This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
Metrics
- views
- 69
- download
- 1
- citations
- 0
Views, downloads and citations are aggregated across all versions of this paper published by eLife.