Functionally Important Residues from Graph Analysis of Coevolved Dynamic Couplings

Manming Xu; Sarath Chandra Dantu; James A Garnett; Robert A Bonomo; Alessandro Pandini; Shozeb Haider

doi:10.7554/eLife.105005.2

eLife Assessment

This paper reports the analysis of coevolutionary patterns and dynamical information for identifying functionally relevant sites. These findings are considered important due to the broad utility of the unified framework and network analysis capable of revealing communities of key residues that go beyond the residue-pair concept. The data are solid and the results are clearly presented.

https://doi.org/10.7554/eLife.105005.2.sa4

Significance of findings

important: Findings that have theoretical or practical implications beyond a single subfield


Strength of evidence

solid: Methods, data and analyses broadly support the claims with only minor weaknesses


During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

The relationship between protein dynamics and function is essential for understanding biological processes and developing effective therapeutics. Functional sites within proteins are critical for activities such as substrate binding, catalysis, and structural changes. Existing computational methods for the prediction of functional residues are trained on sequence, structural and experimental data, but they do not explicitly model the influence of evolution on protein dynamics. This overlooked contribution is essential, as evolution is known to fine-tune protein dynamics through compensatory mutations, either to improve a protein's performance or to diversify its function while maintaining the same structural scaffold. To model this critical contribution, we introduce DyNoPy, a computational method that combines residue coevolution analysis with molecular dynamics (MD) simulations, revealing hidden correlations between functional sites. DyNoPy constructs a graph model of residue-residue interactions, identifies communities of key residue groups and annotates critical sites based on their roles. By leveraging the concept of coevolved dynamical couplings (residue pairs with critical dynamical interactions that have been preserved during evolution), DyNoPy offers a powerful method for predicting and analysing protein evolution and dynamics. We demonstrate the effectiveness of DyNoPy on SHV-1 and PDC-3, chromosomally encoded β-lactamases linked to antibiotic resistance, highlighting its potential to inform drug design and address pressing healthcare challenges.

Introduction

Quantifying the contribution of individual residues or residue groups to protein function is important for estimating the pathogenic effect of mutations (1). Identifying the functional roles of individual residues has primarily been done through mutagenesis experiments (2). Bioinformatics methods have complemented these approaches through analysis of multiple sequence alignments (MSA) of homologous proteins and structural data (3–8). Among these methods, computational techniques that can decode inter-residue evolutionary relationships from MSAs have paved the way for machine learning (ML) based strategies that can predict protein structure (9–12), stability (13), and function (7) and extend the scope of computational protein design (14–16). A recent approach has combined experimental data on stability and function from three proteins, NUDT15, PTEN and CYP2C9, with sequence and structural features to train an ML model to predict functional sites (17).

Functional sites are often regulated by both local and global interactions. Changes in these interactions are instrumental for functional events like substrate binding, catalysis, and conformational changes (18). The development of physical models of protein dynamics and the increase in available computational power have stimulated the adoption of computational techniques (19, 20) to investigate the conformational dynamics of proteins, an essential component of many biological functions (21, 22). Different models have been proposed to describe the interactions between residues during simulations, and network models have been particularly popular. These include methods applied to single structures and to MD simulation data, built by analysing the response to external forces on residue networks (23), by estimating the prevalence of non-covalent energy interaction networks in homologous proteins (24), or by analysing linear or non-linear correlation in atomic fluctuations (25, 26). These techniques have demonstrated their usefulness in extracting allosteric networks from structural data with applications in enzyme design (26).

However, none of these techniques incorporate information on residue evolution into the computational approach, although it has been established that evolution through compensatory mutations in dynamic regions, like hinges and loops, can fine-tune protein structural dynamics and introduce promiscuity, thereby diversifying biological function. Assuming that protein functional dynamics is conserved during evolution, significant information on dynamic regions and substrate recognition sites should be recoverable using inter-residue coevolution scores extracted from MSAs (27, 28). Coevolution analysis and Molecular Dynamics (MD) simulations have been used independently (29) and in combination in the past to identify important residues for function (30–34). Yet a method that combines hidden information on dynamics from evolution with direct information on local and global dynamics from conformational ensembles from MD is not yet available.

Here, we present DyNoPy, a computational method that can extract hidden information on functional sites from the combination of pairwise residue coevolution data and powerful descriptors of dynamics extracted from the analysis of MD ensembles. The method can detect coevolved dynamic couplings, i.e. residue pairs with critical dynamical interactions that have been preserved during evolution. These pairs are extracted from a graph model of residue-residue interactions. Communities of important residue groups are detected, and critical sites are identified by their eigenvector centrality in the graph (Fig. 1). We demonstrate the power of this approach on SHV-1 and PDC-3, β-lactamases of major clinical importance (35, 36). DyNoPy successfully detects residue couplings that align with previous studies, helps explain mutation sites with previously unexplained mechanisms, and provides predictions on plausible important sites for the emergence of clinically relevant variants.

Results and Discussion

β-lactamases are a group of enzymes capable of hydrolysing β-lactams, conferring resistance to β-lactam antibiotics (37). These enzymes are evolving rapidly, as single amino acid substitutions are sufficient to drive their evolution and increase their catalytic spectrum and inhibitor resistance profile (38). The widespread dissemination of β-lactamases across different bacterial species and their extensive emergence highlight their global impact on antibiotic resistance (39). The rapid evolution of β-lactamases and their clinical significance (38) make them an ideal target for evaluating the robustness of DyNoPy.

In this study, we applied DyNoPy to two model enzymes from different β-lactamase families: class A β-lactamase SHV-1 (a chromosomally encoded enzyme in Klebsiella pneumoniae) and class C β-lactamase PDC-3 (a chromosomally encoded enzyme in Pseudomonas aeruginosa) (35, 36) (Supplementary Fig. S1 and S2). Both class A and class C β-lactamases comprise an α/β domain and an α helical domain, with the active site situated in between (40, 41). Moreover, both enzymes target the carbonyl carbon of the β-lactams using a highly conserved serine residue (42, 43). Despite these similarities, the structures of class A and class C β-lactamases are remarkably different (Fig. 2).

Figure 2. Structural Comparison of SHV-1 (PDB ID: 3N4I) and PDC-3 (PDB ID: 4HEF) β-Lactamases. Catalytic serine S₇₀ (SHV-1) and S₆₄ (PDC-3) are highlighted using stick representation. Important loops surrounding the active site are highlighted in red. In SHV-1, highlighted loops are the α3-α4 loop (residues 101-111), the Ω-loop (residues 164-179), and the hinge region (residues 213-218). In PDC-3, highlighted loops are the Ω-loop (residues 183-226) and the R2-loop (residues 280-310).

SHV-1 is a very well characterised enzyme with a wealth of information on mutations and their corresponding effects on protein function. In contrast, the information available on PDC-3 remains limited. Detailed structural information on these enzymes can be found in the supplementary materials. Essential catalytic residues in SHV-1 are S₇₀, K₇₃, S₁₃₀, E₁₆₆, N₁₇₀, K₂₃₄, G₂₃₆, and A₂₃₇ (44), and conserved catalytic residues in PDC-3 include S₆₄, K₆₇, Y₁₅₀, N₁₅₂, K₃₁₅, T₃₁₆, and G₃₁₇. Highly conserved stretches of 3-9 hydrophobic residues, annotated as hydrophobic nodes, exist in class A β-lactamases and have been shown to be essential for protein stability (45). Residues defined as belonging to hydrophobic nodes within SHV-1 are listed in Supplementary Table S1.

In SHV-1, the predominant extended spectrum β-lactamase (ESBL) substitutions occur at L₃₅, G₂₃₈, and E₂₄₀, while R₄₃, E₆₄, D₁₀₄, A₁₄₆, G₁₅₆, D₁₇₉, R₂₀₂, and R₂₀₅ appear in ESBLs with lower frequency (46). Mutations at M₆₉, S₁₃₀, A₁₈₇, T₂₃₅, and R₂₄₄ are known to induce inhibitor resistance in the enzyme (47). In PDC-3, substitutions primarily occur on the Ω-loop, enhancing its flexibility to accommodate the bulky side chains of antibiotics, while deletions are more common in the R2-loop (43). The predominant Ω-loop mutations isolated from clinics are found at positions V₂₁₁, G₂₁₄, E₂₁₉, and Y₂₂₁ (48).

Emergence of highly conserved dynamic couplings

DyNoPy builds a pairwise model of conserved dynamic couplings detected by combining coevolution scores and information on functional motions into a score J_ij (see Methods and Fig. 1). To this end a dynamics descriptor should be selected. When the descriptor is associated with functional conformational changes, it is expected that functionally relevant couplings will report higher scores. Dynamics descriptors can be selected from commonly used geometrical collective variables (CVs) for the analysis of MD trajectories (see Methods). As expected, the average J matrix score varies across the different CVs, with some of them showing no signal of dynamic coupling (Supplementary Fig. S4C).

SHV-1 and PDC-3 exhibit distinct dynamics, requiring a different choice of the CV that best captures the functional dynamics. For SHV-1, the global first principal component (PC1) proved to be the most effective feature, identifying 571 residue pairs with a J_ij value greater than 0. Conversely, PDC-3 requires the selection of more localized features that can separate the Ω-loop dynamics from the overall protein motion. Among the dynamic descriptors, the partial first time-lagged component (TC1_partial) performed best for PDC-3, detecting 216 residue pairs with a J_ij value greater than 0. Consequently, PC1 and TC1_partial were selected to build the J matrix for SHV-1 and PDC-3, respectively. The performance of all 12 CVs for each protein was assessed and is listed in Supplementary Table S2.
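As a minimal sketch of how candidate descriptors could be compared, the snippet below counts the residue pairs with J_ij greater than 0 for each CV and ranks the CVs by that count. It assumes that the J matrices have already been computed and that the number of positive pairs is the ranking criterion; the function names, the CV labels and the toy matrices are hypothetical and are not part of DyNoPy.

```python
def count_positive_pairs(J):
    """Count unique residue pairs (i < j) with a coupling score above zero."""
    n = len(J)
    return sum(1 for i in range(n) for j in range(i + 1, n) if J[i][j] > 0)


def rank_descriptors(j_by_descriptor):
    """Rank descriptors by the number of positive couplings, highest first."""
    counts = {name: count_positive_pairs(J) for name, J in j_by_descriptor.items()}
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


# Toy symmetric 3x3 matrices (illustrative values only, not data from the study)
toy = {
    "cv_a": [[0.0, 0.3, 0.0], [0.3, 0.0, 0.1], [0.0, 0.1, 0.0]],
    "cv_b": [[0.0, 0.0, 0.0], [0.0, 0.0, 0.2], [0.0, 0.2, 0.0]],
}
print(rank_descriptors(toy))  # [('cv_a', 2), ('cv_b', 1)]
```

Counting only the upper triangle avoids double counting symmetric pairs and ignores the diagonal.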

The importance of dynamical information is evident when coevolution couplings (γ_ij) and conserved dynamic couplings (J_ij) are compared: the number of non-zero couplings decreases from 40% to <2% of total residue pairs in the protein (Supplementary Fig. S4D) when information from the dynamics descriptor is added. Thus, the inclusion of protein dynamics in coevolution studies acts as an effective filter that rules out residue pairs that do not have significant correlations with functional motions. Moreover, when relying only on γ_ij, all the residues in SHV-1 and PDC-3 are included within four identified communities (Supplementary Table S3), suggesting that coevolution scores (γ_ij) alone do not effectively discriminate residues relevant for protein function. Furthermore, it would be hard to distinguish critical core residues for each community using only γ_ij, as the eigenvector centrality (EVC) values of the residues do not differ markedly (Supplementary Fig. S5A and S5B). A detailed dynamic investigation of the top residues would therefore be needed to determine which pairs should be selected and analysed further. In contrast, it is much easier to identify essential residues based on the J scores, as clear outliers with significantly higher EVC values can be seen for almost all communities (Supplementary Fig. S5C and S5D) (29, 49). In conclusion, the lack of specificity of purely statistical coevolution analysis supports incorporating a score for the correlation between residue interactions and dynamic behaviour, which enables deconvolution of community information.
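To illustrate how eigenvector centrality singles out hub residues in a weighted coupling graph, the sketch below computes EVC by power iteration in plain Python. It is an illustration under stated assumptions rather than DyNoPy's implementation: the function name, the edge dictionary layout and the toy node labels are hypothetical, and community detection is not shown.

```python
def eigenvector_centrality(edges, max_iter=1000, tol=1e-10):
    """Eigenvector centrality of an undirected weighted graph (power iteration).

    `edges` maps a pair (node_a, node_b) to a positive edge weight.
    Returns a dict node -> centrality, normalised to unit Euclidean length.
    """
    adjacency = {}
    for (a, b), weight in edges.items():
        adjacency.setdefault(a, {})[b] = weight
        adjacency.setdefault(b, {})[a] = weight
    if not adjacency:
        return {}
    score = {node: 1.0 for node in adjacency}
    for _ in range(max_iter):
        # Multiply by (A + I): same eigenvectors as A, but the identity shift
        # prevents oscillation on bipartite graphs.
        updated = {
            node: score[node] + sum(w * score[nb] for nb, w in nbrs.items())
            for node, nbrs in adjacency.items()
        }
        norm = sum(v * v for v in updated.values()) ** 0.5
        updated = {node: v / norm for node, v in updated.items()}
        converged = max(abs(updated[n] - score[n]) for n in adjacency) < tol
        score = updated
        if converged:
            break
    return score


# Toy graph: node "A" is coupled to every other node (illustrative weights only)
toy_edges = {("A", "B"): 0.5, ("A", "C"): 0.5, ("A", "D"): 0.5, ("B", "C"): 0.2}
centrality = eigenvector_centrality(toy_edges)
core = max(centrality, key=centrality.get)  # node with the highest EVC
```

In this toy graph the hub "A" receives the highest centrality, which mirrors how a core residue would stand out as an outlier within its community.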

DyNoPy reveals critical residues and predicts evolutionary pathways in SHV-1

DyNoPy identified eight meaningful communities within SHV-1, each consisting of at least three strongly coupled residues (Supplementary Fig. S4A). All the crucial catalytic residues and critical substitution sites mentioned previously participate in one of these communities, with the exceptions of R₄₃, R₂₀₂, and S₁₃₀. Residues previously known to have a critical role in function or in conferring ESBL or inhibitor-resistant β-lactamase (IRBL) phenotypes are either directly coupled to protein dynamics or act as a central hub. The hubs interact with residues that have a role either in catalysis or in structural stability through their membership of hydrophobic nodes (35). Furthermore, DyNoPy identified key positions (L₁₆₂ and N₁₃₆) within some communities that are known to undergo substitutions conferring an ESBL phenotype in other class A β-lactamases. These substitutions have not yet emerged in the SHV family, providing predictions about the potential future evolution of the enzyme. A detailed description of communities of secondary importance for protein function (communities 3, 8, and 9) is provided in the supplementary information (Supplementary Fig. S6).

DyNoPy predicts mutation hotspots in SHV-1

DyNoPy detects critical mutation sites (L₁₆₂ and N₁₃₆) that are known to extend the range of substrates in other class A β-lactamases but have not yet emerged as variants in the SHV family. A plausible reason these sites have not been modified in the SHV family is their central role within the communities: they mediate couplings with key functional residues essential for catalytic activity and structural stability, which indicates a critical role in protein function and a potentially lower mutation rate. These findings provide predictions about the potential future evolution of the enzyme, as well as plausible explanations for why these mutations have not yet appeared.

L₁₆₂, positioned at the start of the Ω-loop and adjacent to the crucial catalytic residue E₁₆₆, is assigned as the core residue for community 1 (Fig. 3A). While it remains conserved in the SHV family, variants of L₁₆₂ have been isolated in other class A β-lactamases and are known to expand the enzyme catalytic spectrum. A single amino acid substitution at L₁₆₂ can intensify antibiotic resistance in BEL-1 (50), a class A ESBL clinical variant that exhibits robust resistance to ticarcillin and ceftazidime (51). BEL-2 diverges from BEL-1 by a single amino acid substitution (L₁₆₂F), which alters the kinetic properties of the enzyme significantly and increases its affinity towards expanded-spectrum cephalosporins (52). The relationship between L₁₆₂ and protein catalytic functions can be explained using the DyNoPy model, as there are couplings with the catalytically important residues M₆₉, K₇₃, E₁₆₆, and K₂₃₄. Moreover, the BEL case has confirmed that the L₁₆₂F mutation significantly destabilizes the overall protein structure, highlighting the crucial role of L₁₆₂ in maintaining protein stability (50). DyNoPy accurately identifies the centrality of L₁₆₂ by reporting its connections with 28 backbone residues, including nine hydrophobic node residues critical for protein stability. Among these, five hydrophobic residues are part of the α2 node: V₇₅, L₇₆, G₇₈, V₈₀, and L₈₁, highlighting the contribution of L₁₆₂ to the stability of the α2 helix (35).

Figure 3. Communities 1, 4, and 5 of SHV-1 β-Lactamase. All the residues are depicted as spheres on the protein structure. The core residue for each community is highlighted in red, while purple is used to emphasize the secondary core residue. Residues that interact with both cores are coloured in light yellow. Functionally important residues are marked in cyan. Hydrophobic nodes are enclosed with cyan boxes. A. Community 1 of SHV-1, comprising 33 residues with L₁₆₂ as the primary core residue. B. Community 4 of SHV-1, containing 12 residues and centred on G₁₅₆. G₁₅₆ and A₁₄₆ are two functionally important residues distant from the active site: G₁₅₆ is 21.3 Å and A₁₄₆ is 16.8 Å away from the catalytic S₇₀. C. Community 5 of SHV-1, comprising 48 residues and showing a strong correlation between V₁₀₃ and S₁₀₆.

Just like L₁₆₂, N₁₃₆ undergoes advantageous mutations in other class A β-lactamases while remaining highly conserved within the SHV family. It is the core residue for community 7 (Fig. 4B). This residue forms a hydrogen bond with E₁₆₆, stabilizing the Ω-loop (53). Although DyNoPy did not detect this direct interaction between N₁₃₆ and E₁₆₆, the coupling established between N₁₃₆ and N₁₇₀ highlights the role of N₁₃₆ in influencing E₁₆₆. N₁₇₀, an essential catalytic residue located on the Ω-loop, contributes to priming the water molecule for the deacylation step with E₁₆₆ (54) and is directly coupled with N₁₃₆. Because N₁₃₆ helps E₁₆₆ maintain its proper orientation, it was previously thought to be intolerant to mutations, as substitution of asparagine with alanine at this position makes the enzyme lose its function completely (55). However, the N₁₃₆D substitution has recently emerged as a new clinical variant in PenL, a class A β-lactamase, where it increases the ability to hydrolyse ceftazidime (55), suggesting that this site has the potential to mutate. This gain of function is mainly triggered by the increased flexibility of the Ω-loop (55). DyNoPy correctly detects a dynamical relationship between N₁₃₆ and the Ω-loop (residues 164-179). Six residues present in the Ω-loop participate in this community, including R₁₆₄ and D₁₇₉. These two residues are critical as they form the ‘bottleneck’ of the Ω-loop, which is essential for the correct positioning of E₁₆₆ (56). D₁₇₉ is also a critical mutation site for SHV-1. Single amino acid substitutions like D₁₇₉A, D₁₇₉N, and D₁₇₉G are enough for the extended spectrum phenotype (46).

Figure 4. Communities 6 and 7 of SHV-1 β-Lactamase. All the residues are depicted as spheres on the protein structure. The core residue for each community is highlighted in red, while purple is used to emphasize the secondary core residue. Residues that interact with both cores are coloured in light yellow. Functionally important residues are marked in cyan. A. Community 6 of SHV-1, comprising 30 residues with Y₁₀₅ as the primary core residue. R₂₀₅ is a functionally important residue that is 20.6 Å away from the active site S₇₀. B. Community 7 of SHV-1, containing 34 residues and centred on N₁₃₆.

DyNoPy detects residue couplings essential for protein stability

DyNoPy identifies residue couplings critical for protein functional motions, particularly those associated with protein stability. These residue pairs exhibit strong relationships, as they are not only directly coupled with each other but also form various indirect couplings via other residues. As a result, both residues are considered core residues of these communities. It is expected that disruption of these couplings through mutation could compromise collective motions essential for enzyme activity.
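A minimal sketch of how direct and indirect couplings between two core residues could be enumerated from an edge list is given below. It assumes that an indirect coupling means a third residue coupled to both members of the pair (a two-step path); this reading, the function name and the toy residue labels are assumptions made for illustration.

```python
def indirect_couplings(edges, res_a, res_b):
    """Report whether res_a and res_b are directly coupled and list the
    residues coupled to both of them (shared neighbours in the graph)."""
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    near_a = neighbours.get(res_a, set())
    near_b = neighbours.get(res_b, set())
    directly_coupled = res_b in near_a
    return directly_coupled, sorted(near_a & near_b)


# Toy edge list with hypothetical residue labels (not data from the study)
toy_edges = [("r1", "r2"), ("r1", "r3"), ("r2", "r3"), ("r1", "r4"), ("r2", "r5")]
direct, shared = indirect_couplings(toy_edges, "r1", "r2")
print(direct, shared)  # True ['r3']
```

A pair with a direct edge and many shared neighbours would correspond to the primary and secondary core residues described in the text.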

As the secondary core residue in community 1 (Fig. 3A), F₇₂ shows a strong coupling with the primary core residue L₁₆₂ and also forms nine indirect couplings with L₁₆₂, including one via the catalytic K₂₃₄. This network of direct and indirect relationships reveals the importance of the F₇₂-L₁₆₂ coupling in maintaining protein functional motions. Interestingly, previous studies identified a small hydrophobic cavity formed by L₁₆₂ and F₇₂, together with L₁₃₉ and L₁₄₈, which is essential for the stability of the active site (50). Notably, DyNoPy successfully recovers the key residues of this local hydrophobic cavity (L₁₆₂, F₇₂, and L₁₄₈).

The strong interplay between V₁₀₃ and S₁₀₆, both located on the α3-α4 loop, is seen in community 5 (Fig. 3C). These residues not only interact with each other directly but are also indirectly coupled via 22 other residues. This community emphasizes the significance of hydrophobic nodes in SHV stability and dynamics. Of the 48 residues analysed, 27 are hydrophobic, of which 15 act as nodes critical for enzyme stabilization. Hydrophobic nodes stabilize their own secondary structures and interconnect to stabilize the overall protein (57). V₁₀₃ and S₁₀₆ themselves are hydrophobic nodes, stabilizing the α3 helix and α4 helix respectively, and are strongly coupled with each other. In CTX-M, another class A enzyme, N₁₀₆S is a common substitution that improves thermodynamic stability and compensates for the loss in stability of the variants (58). Interestingly, this residue is already a serine in SHV, which is consistent with a pivotal role of this position in protein stability.

DyNoPy provides valid explanations for mutation sites

During the evolution of β-lactamases, single mutations at specific sites that are distant from the functional sites have been observed to significantly alter protein catalytic functions. Additionally, single mutations at some surface-exposed residues can dramatically increase protein stability. Understanding how these distant mutations impact function and stability is a major challenge in understanding protein evolutionary pathways. Communities extracted by DyNoPy show these residues linked with functionally important residues, providing a rationale for these mutation sites with unknown functions.

Mutations of G₁₅₆ are limited, but they lead to an ESBL phenotype in the SHV family (46). G₁₅₆ is the central residue for community 4 (Fig. 3B), but it is distant from the active site, over 20 Å away from the catalytic serine S₇₀. The clinical variant SHV-27 has extended resistance towards cefotaxime, ceftazidime, and aztreonam (59). It differs from SHV-1 by a single amino acid substitution, G₁₅₆D, suggesting that it has directly evolved from SHV-1 (59). Limited research has been done on position G₁₅₆, and how it affects the enzyme's catalytic properties, given its distance from the active site, remains unclear. Based on our results, we suggest that this residue is essential for overall protein function because of its 11 coevolved dynamic couplings, including one with A₁₄₆, another ESBL substitution site.

SHV-38, another ESBL that is capable of hydrolysing carbapenems, harbours a single A₁₄₆V substitution compared to SHV-1 (60). Like G₁₅₆, A₁₄₆ is distant from the active site (16.8 Å from S₇₀) yet is able to alter protein catalytic function. The A₁₄₆-G₁₅₆ residue pair shows a strong coevolutionary signal and a strong correlation with overall protein dynamics, implying that there may be compensatory mutations at these sites with the potential to emerge in the SHV family in the future. These two residues are not connected to any catalytic residues, but their coupling to functional dynamics offers a plausible explanation for the ESBL activity of these two mutations.

Unlike other substitution sites that are adjacent to the active site, R₂₀₅ is situated more than 20 Å away from the catalytic serine S₇₀. Its side chain points outward from the protein and is exposed to the solvent. The R₂₀₅L substitution often co-occurs with other ESBL mutations and is thought to indirectly contribute to the ESBL phenotype by compensating for stability loss induced by other mutations (61). SHV-3 is an ESBL that exhibits significant resistance to cefotaxime and ceftriaxone (62). Two substitutions in this enzyme, R₂₀₅L and G₂₃₈S, extend its resistance profile (62). Thus, it is encouraging that DyNoPy detected these two mutation sites together within community 6 (Fig. 4A).

Y₁₀₅ and R₂₆₆ are the core residues for community 6. Y₁₀₅ is situated on the α3-α4 loop positioned at the left side of the binding pocket. It is an important catalytic residue that recognizes and binds to the thiazolidine ring of penicillins or β-lactamase inhibitors (63). There is very limited information on the role of R₂₆₆, except that it may stabilize the Ω-loop in the SHV family, similar to the analogous T₂₆₆ in TEM (64). G₂₃₈ is coupled with the essential catalytic residue Y₁₀₅, which further links with other catalytic residues, S₇₀ and A₂₃₇, and with R₂₆₆, a residue thought to stabilize the Ω-loop. This indicates that mutations at G₂₃₈ would alter protein catalytic function and increase the flexibility of the protein, which aligns with previous findings (62). Its linked mutation site R₂₀₅ does not show direct coupling with any catalytic residue. Instead, it is directly coupled with R₂₆₆, the putative Ω-loop stabilizer mentioned above. Thus, it is not surprising that the R₂₀₅ substitution alone has never been observed in nature (65), as it would not give a significant evolutionary advantage to the protein.

Insights into unexplained functional sites of PDC-3

Unlike the extensively studied SHV-1, the functional roles of individual amino acids in PDC-3 remain largely unexplored. This gap in understanding presents a welcome challenge for interpreting the effects of mutations and the dynamic behaviour of PDC-3 from our results. Although several mutation hotspots, such as those on the Ω-loop (48), have been identified, very little is known about the specific contributions of individual amino acids to the functionality of PDC-3.

In PDC-3, mutations have primarily been reported in the Ω-loop. They enhance its flexibility to accommodate the bulky side chains of antibiotics, while deletions are more common in the R2-loop (43). DyNoPy detected five communities in total (Supplementary Fig. S4B), and all four predominant Ω-loop mutation sites appear in these communities. Communities 3, 4 and 5 are described in the Supplementary information (Fig. S7). Furthermore, DyNoPy also detected several previously unexplored Ω-loop residues.

G₂₁₄, a known mutation site in PDC-3, is the core residue in community 1. Two other essential mutation sites, E₂₁₉ and Y₂₂₁, also participate in this community and are directly coupled with G₂₁₄ (Fig. 5A). G₂₁₄ also has direct couplings with four other Ω-loop residues: A₁₉₅, A₁₉₇, G₂₁₂, and L₂₁₆. Previous results have demonstrated that substitutions of glycine with alanine or arginine at position 214 significantly destabilize the Ω-loop (36). The strong correlation between G₂₁₄ and these Ω-loop residues emphasizes the significant contribution of G₂₁₄ to the stability of the Ω-loop, which corroborates previous results (36). Moreover, substitutions such as G₂₁₄A and G₂₁₄R and mutations at E₂₁₉ and Y₂₂₁ do not affect R2-loop flexibility, resulting in the smaller active site volume among variants (36). None of the residues from the R2-loop are detected in this community, which offers a plausible explanation for this previously unexplained phenomenon.

Figure 5. Communities 1 and 2 of PDC-3 β-Lactamase. All the residues are depicted as spheres on the protein structure. The core residue for each community is highlighted in red. Functionally important residues are marked in cyan. A. Community 1 of PDC-3, comprising 36 residues with G₂₁₄ as the primary core residue. B. Community 2 of PDC-3, containing 74 residues and centred on G₂₀₄.

G₂₀₄ is the core residue of community 2, coupled with 73 other residues, most of which are distant from the catalytic site, suggesting a plausible crucial role in overall protein stability, like L₁₆₂ in SHV-1 (Fig. 5B). G₂₀₄, a newly emerged mutation site in the PDC family (66), is located on the short β-sheet β5a within the Ω-loop, near the hinge region between β8 and β9 just above the active site. The only known variant of G₂₀₄ is PDC-466, which was derived from PDC-462 (A₈₉V, Q₁₂₀K, V₂₁₁A, N₃₂₀S) with the addition of G₂₀₄D (66). The coupling of G₂₀₄ to several catalytically important residues, including K₆₇, K₃₁₅, and T₃₁₆, suggests that mutations at this site can negatively impact catalytic power. This offers a plausible explanation for the small number of variants at this site, and indicates that mutations here could affect the hydrolysing capabilities of PDC variants. This should be confirmed by further experimental studies of G₂₀₄ variants. Unlike mutations at G₂₁₄, E₂₁₉ and Y₂₂₁, which do not influence the dynamics of the R2-loop, substitutions at V₂₁₁, a member of the Ω-loop, affect the dynamics of the R2-loop because of its indirect couplings, through G₂₀₄, to R2-loop residues (36). Two less critical substitution sites, H₁₈₈ and V₃₂₉, were also observed in community 2.

Conclusions

DyNoPy offers two distinct advantages over existing computational tools (24, 26): a) information on residue-residue coevolution can be directly used to detect the components of protein dynamics that have been preserved during evolution; and b) dynamic descriptors extracted from the MD ensembles can be used to identify the function-specific conserved dynamic couplings. These couplings are then easily modelled as a graph, and network analysis is used to extract epistatic communities and assign roles to residues based on their importance in the graph model. The choice of a relevant descriptor of functional dynamics has an impact on the ability to detect couplings that are involved in functional dynamics. In systems where dynamics plays a limited role in function, the analysis done with DyNoPy is equivalent to conventional coevolution analysis, which can be considered one limitation of our method.

Here we demonstrated how the choice of relevant global and local descriptors returns a higher number of effective couplings (greater than 0), and in turn leads to interpretable graph models and communities. In other systems, when multiple descriptors can be used to quantify functional conformational change, it is expected that they will modulate the effect of coevolution coupling differently, which will be reflected in a different structure of the associated graph models. This suggests the use of DyNoPy to generate comparative models in proteins with multiple functions associated with distinct dynamical changes.

Mutations of L₁₆₂ and N₁₃₆ have not yet emerged in SHV-1, but they are detected by DyNoPy as core residues for communities. These residues are strongly coupled with other functionally important residues, which play critical roles in protein stability and catalytic activity. The identification of these couplings shows high consistency with previous studies and highlights the importance of L₁₆₂ and N₁₃₆ in SHV-1 functional dynamics. Given their central role in these communities, mutations at L₁₆₂ and N₁₃₆ can significantly alter protein function, suggesting their potential for future evolutionary changes. However, their strong relationships with these critical functional residues also suggest that mutations at these sites would need to be balanced to maintain protein function, providing an explanation for why such mutations have not yet emerged in SHV-1 (67). The ability of DyNoPy to detect functionally important mutation sites was demonstrated via well-characterized mutation sites including R₂₀₅ and G₂₃₈ from SHV-1. Moreover, DyNoPy shows predictive ability on less-studied mutation sites such as G₁₅₆ and A₁₄₆, by detecting critical residue couplings that coevolved with functional motions.

Based on the knowledge gained from the analysis of SHV-1 functional protein dynamics, we suggest that in PDC-3, mutations at G₂₀₄, because of its significant conserved dynamic couplings, can lead to new ESBL/IRBL clinical variants. We suggest that DyNoPy can be used as a predictive tool to identify potential functional residues within this enzyme and guide future mutagenesis studies.

In summary, by integrating hidden evolutionary information with direct dynamic interactions, DyNoPy provides a powerful framework for identifying and analysing functional sites in proteins. The tool not only identifies key residues involved in local and global interactions, but also improves our ability to predict silent residues with previously unknown roles for future experimental testing. Our application of DyNoPy to broad-spectrum β-lactamases ESBLs and IRBLs demonstrates its potential to address key medical challenges such as antibiotic resistance by providing valid predictions on protein evolution.

Methodology

DyNoPy generates a graph representation of the protein structure that captures the couplings between amino acid residues contributing to the functional dynamics of the protein. Residues are represented as graph nodes, and conserved dynamic couplings are recorded as edges. Edge weights quantify the strength of these couplings. The model is built on two assumptions: i) residue pairs should have coevolved, and ii) their time-dependent interactions should correlate with a functional conformational change.

Therefore, edge weights (J_ij) for residues i and j are calculated as:

where γ_ij is the scaled coevolution score and ρ_ij is the degree of correlation with the selected functional conformational change. α and β are weights assigned to γ_ij and ρ_ij that have a sum of one. The relative weight of the scaled coevolution score (α) is set to 0.5 in this study. When either of the assumptions listed above is not met, J_ij is set to zero.
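Equation 1 itself is not reproduced in this text. One form consistent with the definitions above, offered as an assumption rather than as the authors' exact notation, is a weighted sum that is set to zero whenever either condition fails:

```latex
J_{ij} =
\begin{cases}
\alpha\,\gamma_{ij} + \beta\,\rho_{ij}, & \text{if } \gamma_{ij} > 0 \text{ and } \rho_{ij} > 0,\\[4pt]
0, & \text{otherwise,}
\end{cases}
\qquad \alpha + \beta = 1 .
```

With α = 0.5, as used in this study, β is also 0.5 under this reading, so a retained edge weight would be the arithmetic mean of γ_ij and ρ_ij.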

Scaled coevolution scores

The occurrence of residue-residue coevolution can be estimated and quantified using probabilistic models of correlated mutations from deep multiple sequence alignments (MSA). DyNoPy supports generation of the MSA using the HH-Suite package (68) and calculation of the scaled coevolution score (γ_ij) using CCMpred (69), following the protocol described in Bibik et al. (70). For SHV-1 and PDC-3, hhblits returned 18,174 sequences (N_eff: 11.082) and 27,892 sequences (N_eff: 9.951), respectively. Sequences were retrieved from the UniRef30 (v2022_02) database (71). First, a pairwise residue coevolution matrix (C) is calculated; these raw scores (C_ij) are then divided by the matrix mean (Equation 2). All scores (S_ij) smaller than 1 are set to zero, and the remaining values are normalised by the maximum value (Equation 3):
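The scaling described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not DyNoPy's own code: the function name `scale_coevolution` is hypothetical, the input values are toy numbers rather than CCMpred output, and taking the matrix mean over the off-diagonal pairs only is an assumption.

```python
def scale_coevolution(raw):
    """Scale a symmetric raw coevolution matrix C (list of lists) into gamma.

    Assumption: the "matrix mean" is taken over off-diagonal pairs (i < j),
    and that mean is non-zero.
    """
    n = len(raw)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    mean = sum(raw[i][j] for i, j in pairs) / len(pairs)

    # Step described as Equation 2: S_ij = C_ij / mean(C)
    scaled = {(i, j): raw[i][j] / mean for i, j in pairs}

    # Scores below 1 (below-average coevolution) are set to zero.
    kept = {pair: s for pair, s in scaled.items() if s >= 1.0}

    # Step described as Equation 3: divide the surviving scores by their maximum.
    gamma = [[0.0] * n for _ in range(n)]
    if kept:
        s_max = max(kept.values())
        for (i, j), s in kept.items():
            gamma[i][j] = gamma[j][i] = s / s_max
    return gamma


# Toy 4-residue example (illustrative values only).
toy_C = [
    [0.0, 6.0, 1.0, 1.0],
    [6.0, 0.0, 3.0, 0.5],
    [1.0, 3.0, 0.0, 0.5],
    [1.0, 0.5, 0.5, 0.0],
]
toy_gamma = scale_coevolution(toy_C)
```

In this toy matrix only the pairs (0, 1) and (1, 2) score above the mean, so they are the only pairs with a non-zero γ_ij.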

Correlation with functional motions

The contribution of a residue pair to a selected functional motion is estimated by how much the change in interaction energy between the two residues over time is correlated with a collective variable (CV) describing the functional motion:

where ε_ij(t) is the pairwise non-bonded interaction energy (see details in Supplementary Information) and d(t) is the time-dependent value of the CV. Examples of CVs and a discussion of the choice of the most relevant CV are presented in the Results section. Correlation values smaller than 0.5 are set to 0. In the absence of detectable contributions to the functional dynamics of the system, the couplings extracted by DyNoPy will describe a pure evolutionary model, and the community detection method presented below will be equivalent to a direct decomposition of the residue coevolution network into units.
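As an illustration of this step, the sketch below correlates a pairwise interaction-energy time series with a CV time series and applies the 0.5 cutoff. The correlation measure is not specified in this section, so the use of the absolute Pearson coefficient is an assumption, and the function names `pearson` and `rho` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rho(energy, cv, cutoff=0.5):
    """Correlation of eps_ij(t) with d(t); values below the cutoff are set to 0.

    Assumption: the magnitude of the correlation is what matters, so the
    absolute value is taken before thresholding.
    """
    r = abs(pearson(energy, cv))
    return r if r >= cutoff else 0.0


# Toy time series (illustrative values only).
cv_trace = [1.0, 2.0, 3.0, 4.0]
coupled_energy = [-1.0, -2.0, -3.0, -4.0]   # tracks the CV perfectly (anti-correlated)
uncoupled_energy = [1.0, -1.0, 1.0, -1.0]   # weakly related to the CV

rho_coupled = rho(coupled_energy, cv_trace)
rho_uncoupled = rho(uncoupled_energy, cv_trace)
```

A rank-based measure could be substituted for `pearson` without changing the thresholding logic.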

Graph representation and analysis of conserved dynamic couplings

All pairwise conserved dynamic couplings (Equation 1) are collected into a square matrix J. A graph is built from J using the python-igraph v0.11 library (72). Nodes represent residues, and edges are drawn between nodes with positive J_ij. Edge weights are set to J_ij. The relative importance of the residues in this model of protein dynamics is calculated as the eigenvector centrality of the nodes (73). Residues involved in extensive correlated dynamics with other highly connected residues have higher eigenvector centrality (EVC) scores. Groups of residues contributing to important collective motions are detected by community analysis of the graph structure. The Girvan-Newman algorithm is used to extract the community structure (74). A meaningful community should contain at least three residues. Applying network analysis to the combined dynamics-coevolution matrix helps us extract higher-order interactions beyond pairwise coupling and detect critical residues, which show multiple interactions with each other. Moreover, indirect long-range relationships, which would be hard to identify from numerical data, can be detected through community clustering. Community-based analysis offers a more comprehensive understanding of residue relationships and enables the visualization of residue couplings on the protein structure.
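To make the centrality step concrete, the following dependency-free sketch computes eigenvector centrality from a toy J matrix by power iteration. It is not the python-igraph call used by DyNoPy: the function name `eigenvector_centrality`, the toy matrix, and the choice to scale the top score to 1 are assumptions made for illustration, and community detection is not shown.

```python
def eigenvector_centrality(J, max_iter=1000, tol=1e-10):
    """Eigenvector centrality of the graph whose edges are the positive J_ij.

    J is a symmetric list-of-lists matrix. Scores are scaled so that the
    most central node has a value of 1.
    """
    n = len(J)
    # Keep only positive off-diagonal couplings as weighted edges.
    w = [[J[i][j] if i != j and J[i][j] > 0 else 0.0 for j in range(n)]
         for i in range(n)]
    x = [1.0] * n
    for _ in range(max_iter):
        # Power iteration on (A + I): same eigenvectors as A, but the shift
        # prevents oscillation on bipartite graphs such as stars and paths.
        y = [x[i] + sum(w[i][j] * x[j] for j in range(n)) for i in range(n)]
        top = max(y)
        y = [v / top for v in y]
        converged = max(abs(a - b) for a, b in zip(x, y)) < tol
        x = y
        if converged:
            break
    return x


# Toy coupling matrix: residue 1 is coupled to residues 0, 2 and 3.
toy_J = [
    [0.0, 0.8, 0.0, 0.0],
    [0.8, 0.0, 0.6, 0.5],
    [0.0, 0.6, 0.0, 0.0],
    [0.0, 0.5, 0.0, 0.0],
]
evc = eigenvector_centrality(toy_J)
hub = max(range(len(evc)), key=evc.__getitem__)
```

In the toy graph the hub residue receives the top score, and the three peripheral residues are ranked by the weight of their coupling to it. Communities returned by a separate detection step would then be filtered to those with at least three residues, as stated above.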

Adaptive Sampling Molecular Dynamics Simulations

MD simulation data were sourced from our previous studies (35, 36). To summarise, SHV-1 structural coordinates (PDB ID: 3N4I) were obtained from the Protein Data Bank and modified to the wild type by introducing the E104D mutation. Similarly, the PDC-3 structure was derived from PDC-1 (PDB ID: 4HEF) by a T105A substitution. Both enzymes were protonated at pH 7.0 using PropKa from the PlayMolecule platform (75). One disulfide bond between C₇₇ and C₁₂₃ was specified in SHV-1. Both structures were solvated with TIP3P water molecules in a periodic box with a box size of 10 Å. Ions were added to neutralize the overall charge of each system at 150 mM KCl. The Amber ff14SB force field was used for all MD simulations (76). After an initial minimisation of 1000 steps, both enzymes were equilibrated for 5 ns in the NPT ensemble at a pressure of 1 atm using the Berendsen barostat (77). The initial velocities for each simulation were sampled from the Boltzmann distribution at 300 K. Multiple Markov State Model (MSM)-based adaptively sampled simulations were performed for both proteins using the ACEMD engine (78, 79). A canonical (NVT) ensemble with a Langevin thermostat (80) (damping coefficient of 0.1 ps⁻¹) and a hydrogen mass repartitioning scheme were employed to achieve time steps of 4 fs. For SHV-1, 593 trajectories were collected, each spanning 60 ns with frames saved every 0.1 ns. For PDC-3, 100 trajectories were collected, each lasting 300 ns and containing 3000 frames. To manage these large datasets efficiently, trajectories were subsampled with a stride chosen so that at least 30,000 frames were retained for each system. The resulting trajectories are summarized in Supplementary Table S4.
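The frame counts implied by these numbers can be checked with a short calculation. The sketch below assumes that frames were saved every 0.1 ns in both systems and that the same stride is applied to every trajectory. The helper `frames_and_stride` is hypothetical, and the strides it returns are only the largest values compatible with the 30,000-frame minimum, not necessarily those used by the authors (see Supplementary Table S4).

```python
def frames_and_stride(n_traj, traj_ns, frame_ns=0.1, min_frames=30_000):
    """Return (total frames, largest usable stride, frames kept after striding).

    Assumptions: one frame every `frame_ns` nanoseconds, and the same stride
    applied independently to each trajectory.
    """
    frames_per_traj = round(traj_ns / frame_ns)
    total = n_traj * frames_per_traj
    # Largest integer stride that still leaves at least `min_frames` frames.
    stride = max(1, total // min_frames)
    # Striding a trajectory of F frames keeps ceil(F / stride) of them.
    kept_per_traj = -(-frames_per_traj // stride)
    return total, stride, n_traj * kept_per_traj


shv1 = frames_and_stride(n_traj=593, traj_ns=60)    # SHV-1
pdc3 = frames_and_stride(n_traj=100, traj_ns=300)   # PDC-3
```

Under these assumptions SHV-1 has 355,800 frames in total (35.58 µs of aggregate sampling) and PDC-3 has 300,000 frames (30 µs), so both systems can be subsampled roughly tenfold while still meeting the 30,000-frame minimum.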

Calculation and Selection of Collective Variables

DyNoPy works on the assumption that time-dependent interactions between critical residues, whether or not they undergo significant structural change, will correlate with functional conformational motions. Since MD simulation data is high-dimensional, a time-dependent collective variable (CV) is required to extract the most relevant information for the process under study. The usefulness of DyNoPy depends on the choice of the CVs. To guide the selection of CVs, we selected 12 distinct features: radius of gyration (R_g), the first principal component (PC1), partial PC1 (PC1_partial), the first time-lagged independent component (TC1), partial TC1 (TC1_partial), global root mean square deviation (gRMSD), partial RMSD (pRMSD), dynamical RMSD (dRMSD), global solvent accessible surface area (gSASA), partial SASA (pSASA), active site pocket volume, and the number of hydrogen bonds (hbond). A description of the CVs, including the calculation methods and the residues used to calculate the partial variables, is given in the Supplementary Information. CVs were subsequently used as input features for DyNoPy. A good CV should appropriately describe protein functional motions. Thus, a CV that detects the highest number of residue couplings is expected to be the most suitable descriptor. The length of the MD simulations should be appropriate to effectively sample the desired functional process as described by the selected CV.
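The selection rule stated above (prefer the CV that yields the most couplings) can be sketched as a simple count over per-CV coupling matrices. The function names `count_couplings` and `best_cv`, the dictionary layout, and the toy matrices are hypothetical and serve only to illustrate the ranking.

```python
def count_couplings(J):
    """Number of residue pairs (i < j) with a positive coupling J_ij."""
    n = len(J)
    return sum(1 for i in range(n) for j in range(i + 1, n) if J[i][j] > 0)


def best_cv(matrices):
    """Pick the CV whose J matrix contains the most positive couplings.

    `matrices` maps a CV name to the J matrix obtained with that CV.
    Ties are resolved in favour of the CV listed first.
    """
    counts = {name: count_couplings(J) for name, J in matrices.items()}
    return max(counts, key=counts.get), counts


# Toy J matrices for two CVs on a 3-residue system (illustrative values only).
toy_matrices = {
    "Rg": [[0.0, 0.6, 0.0],
           [0.6, 0.0, 0.0],
           [0.0, 0.0, 0.0]],
    "PC1": [[0.0, 0.6, 0.7],
            [0.6, 0.0, 0.0],
            [0.7, 0.0, 0.0]],
}
selected, coupling_counts = best_cv(toy_matrices)
```

Counting couplings is only a first filter; whether the resulting graph is interpretable still has to be judged against what is known about the protein.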

Data Availability

All files required to run the simulations (topology, coordinates, input) and the processed trajectories (xtc) with their corresponding coordinates (pdb) can be downloaded from https://doi.org/10.57760/sciencedb.15876 (PDC-3) and DOI 10.5281/zenodo.13693144 (SHV-1). DyNoPy is available at https://github.com/alepandini/DyNoPy.

Acknowledgements

SCD was supported by Leverhulme Trust grant RPG-2017-222 awarded to AP and JAG. The authors would like to thank Arianna Fornili for insightful suggestions on the design of DyNoPy methodology.

Additional files

Supplementary Information

References

1. Stenson P. D., et al. (2017) The Human Gene Mutation Database: towards a comprehensive repository of inherited mutation data for medical research, genetic diagnosis and next-generation sequencing studies. Hum Genet 136:665–677.
2. Matreyek K. A., et al. (2018) Multiplex assessment of protein variant abundance by massively parallel sequencing. Nat Genet 50:874–882.
3. Poelwijk F. J., Krishna V., Ranganathan R. (2016) The Context-Dependence of Mutations: A Linkage of Formalisms. PLoS Comput Biol 12:e1004771.
4. Høie M. H., Cagiada M., Beck Frederiksen A. H., Stein A., Lindorff-Larsen K. (2022) Predicting and interpreting large-scale mutagenesis data using analyses of protein stability and conservation. Cell Rep 38:110207.
5. Blaabjerg L. M., et al. (2023) Rapid protein stability prediction using deep learning representations. eLife 12.
6. Dunham A. S., Beltrao P. (2021) Exploring amino acid functions in a deep mutational landscape. Mol Syst Biol 17:e10305.
7. Hopf T. A., et al. (2017) Mutation effects predicted from sequence co-variation. Nat Biotechnol 35:128–135.
8. Radivojac P., et al. (2013) A large-scale evaluation of computational protein function prediction. Nat Methods 10:221–227.
9. Jumper J., et al. (2021) Highly accurate protein structure prediction with AlphaFold. Nature 596:583–589.
10. Lin Z., et al. (2023) Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379:1123–1130.
11. Baek M., et al. (2021) Accurate prediction of protein structures and interactions using a three-track neural network. Science 373:871–876.
12. Marks D. S., Hopf T. A., Sander C. (2012) Protein structure prediction from sequence variation. Nat Biotechnol 30:1072–1080.
13. Broom A., Trainor K., Jacobi Z., Meiering E. M. (2020) Computational Modeling of Protein Stability: Quantitative Analysis Reveals Solutions to Pervasive Problems. Structure 28:717–726.
14. Ding D., et al. (2024) Protein design using structure-based residue preferences. Nat Commun 15:1639.
15. Wu Z., Kan S. B. J., Lewis R. D., Wittmann B. J., Arnold F. H. (2019) Machine learning-assisted directed protein evolution with combinatorial libraries. Proc Natl Acad Sci U S A 116:8852–8858.
16. Russ W. P., et al. (2020) An evolution-based model for designing chorismate mutase enzymes. Science 369:440–445.
17. Cagiada M., et al. (2023) Discovering functionally important sites in proteins. Nat Commun 14:4175.
18. Wodak S. J., et al. (2019) Allostery in Its Many Disguises: From Theory to Applications. Structure 27:566–578.
19. Campitelli P., Modi T., Kumar S., Ozkan S. B. (2020) The Role of Conformational Dynamics and Allostery in Modulating Protein Evolution. Annu Rev Biophys 49:267–288.
20. Rodrigues C. H. M., Pires D. E. V., Ascher D. B. (2021) DynaMut2: Assessing changes in stability and flexibility upon single and multiple point missense mutations. Protein Sci 30:60–69.
21. Henzler-Wildman K., Kern D. (2007) Dynamic personalities of proteins. Nature 450:964–972.
22. James L. C., Tawfik D. S. (2003) Conformational diversity and protein evolution--a 60-year-old hypothesis revisited. Trends Biochem Sci 28:361–368.
23. Nevin Gerek Z., Kumar S., Banu Ozkan S. (2013) Structural dynamics flexibility informs function and evolution at a proteome scale. Evol Appl 6:423–433.
24. Yehorova D., Crean R. M., Kasson P. M., Kamerlin S. C. L. (2024) Key interaction networks: Identifying evolutionarily conserved non-covalent interaction networks across protein families. Protein Sci 33:e4911.
25. Lange O. F., Grubmuller H. (2006) Generalized correlation for biomolecular dynamics. Proteins 62:1053–1061.
26. Osuna S. (2020) The challenge of predicting distal active site mutations in computational enzyme design. WIREs Computational Molecular Science 11.
27. Granata D., Ponzoni L., Micheletti C., Carnevale V. (2017) Patterns of coevolving amino acids unveil structural and dynamical domains. Proc Natl Acad Sci U S A 114:E10612–E10621.
28. Liu Y., Bahar I. (2012) Sequence evolution correlates with structural dynamics. Mol Biol Evol 29:2253–2263.
29. Parente D. J., Ray J. C., Swint-Kruse L. (2015) Amino acid positions subject to multiple coevolutionary constraints can be robustly identified by their eigenvector network centrality scores. Proteins 83:2293–2306.
30. Ponzoni L., Polles G., Carnevale V., Micheletti C. (2015) SPECTRUS: A Dimensionality Reduction Approach for Identifying Dynamical Domains in Protein Complexes from Limited Structural Datasets. Structure 23:1516–1525.
31. Sutto L., Marsili S., Valencia A., Gervasio F. L. (2015) From residue coevolution to protein conformational ensembles and functional dynamics. Proc Natl Acad Sci U S A 112:13567–13572.
32. Estabrook R. A., et al. (2005) Statistical coevolution analysis and molecular dynamics: identification of amino acid pairs essential for catalysis. Proc Natl Acad Sci U S A 102:994–999.
33. Chen Z., Rappert S., Sun J., Zeng A. P. (2011) Integrating molecular dynamics and co-evolutionary analysis for reliable target prediction and deregulation of the allosteric inhibition of aspartokinase for amino acid production. J Biotechnol 154:248–254.
34. Wang J., Zhao Y., Wang Y., Huang J. (2013) Molecular dynamics simulations and statistical coupling analysis reveal functional coevolution network of oncogenic mutations in the CDKN2A-CDK6 complex. FEBS Lett 587:136–141.
35. Olehnovics E., et al. (2021) The Role of Hydrophobic Nodes in the Dynamics of Class A beta-Lactamases. Front Microbiol 12:720991.
36. Chen S., et al. (2024) Omega-Loop mutations control the dynamics of the active site by modulating a network of hydrogen bonds in PDC-3 beta-lactamase. bioRxiv https://doi.org/10.1101/2024.02.04.578824
37. Poole K. (2004) Resistance to beta-lactam antibiotics. Cell Mol Life Sci 61:2200–2223.
38. Bush K. (2018) Past and Present Perspectives on beta-Lactamases. Antimicrob Agents Chemother 62.
39. Bush K. (2013) Proliferation and significance of clinically relevant beta-lactamases. Ann N Y Acad Sci 1277:84–90.
40. Matagne A., Lamotte-Brasseur J., Frere J. M. (1998) Catalytic properties of class A beta-lactamases: efficiency and diversity. Biochem J 330:581–598.
41. Philippon A., Arlet G., Labia R., Iorga B. I. (2022) Class C beta-Lactamases: Molecular Characteristics. Clin Microbiol Rev 35:e0015021.
42. Palzkill T. (2018) Structural and Mechanistic Basis for Extended-Spectrum Drug-Resistance Mutations in Altering the Specificity of TEM, CTX-M, and KPC beta-lactamases. Front Mol Biosci 5:16.
43. Jacoby G. A. (2009) AmpC beta-lactamases. Clin Microbiol Rev 22:161–182.
44. Ambler R. P., et al. (1991) A standard numbering scheme for the class A beta-lactamases. Biochem J 276:269–270.
45. Galdadas I., et al. (2018) Defining the architecture of KPC-2 Carbapenemase: identifying allosteric networks to fight antibiotics resistance. Sci Rep 8:12916.
46. Liakopoulos A., Mevius D., Ceccarelli D. (2016) A Review of SHV Extended-Spectrum beta-Lactamases: Neglected Yet Ubiquitous. Front Microbiol 7:1374.
47. Pagan-Rodriguez D., et al. (2004) Tazobactam inactivation of SHV-1 and the inhibitor-resistant Ser130 --> Gly SHV-1 beta-lactamase: insights into the mechanism of inhibition. J Biol Chem 279:19494–19501.
48. Barnes M. D., et al. (2018) Deciphering the Evolution of Cephalosporin Resistance to Ceftolozane-Tazobactam in Pseudomonas aeruginosa. mBio 9.
49. Negre C. F. A., et al. (2018) Eigenvector centrality for characterization of protein allosteric pathways. Proc Natl Acad Sci U S A 115:E12201–E12208.
50. Pozzi C., et al. (2016) Crystal Structure of the Pseudomonas aeruginosa BEL-1 Extended-Spectrum beta-Lactamase and Its Complexes with Moxalactam and Imipenem. Antimicrob Agents Chemother 60:7189–7199.
51. Bogaerts P., Bauraing C., Deplano A., Glupczynski Y. (2007) Emergence and dissemination of BEL-1-producing Pseudomonas aeruginosa isolates in Belgium. Antimicrob Agents Chemother 51:1584–1585.
52. Poirel L., et al. (2010) BEL-2, an extended-spectrum beta-lactamase with increased activity toward expanded-spectrum cephalosporins in Pseudomonas aeruginosa. Antimicrob Agents Chemother 54:533–535.
53. Bös F., Pleiss J. (2008) Conserved water molecules stabilize the Omega-loop in class A beta-lactamases. Antimicrob Agents Chemother 52:1072–1079.
54. Agarwal V., Yadav T. C., Tiwari A., Varadwaj P. (2023) Detailed investigation of catalytically important residues of class A beta-lactamase. J Biomol Struct Dyn 41:2046–2073.
55. Cao T. P., et al. (2020) Non-catalytic-Region Mutations Conferring Transition of Class A beta-Lactamases Into ESBLs. Front Mol Biosci 7:598998.
56. Parwana D., et al. (2024) The Structural Role of N170 in Substrate-Assisted Deacylation in KPC-2 beta-Lactamase. Angew Chem Int Ed Engl 63:e202317315.
57. Galdadas I., et al. (2021) Allosteric communication in class A beta-lactamases occurs via cooperative coupling of loop dynamics. eLife 10.
58. Lu S., et al. (2022) An active site loop toggles between conformations to control antibiotic hydrolysis and inhibition potency for CTX-M beta-lactamase drug-resistance enzymes. Nat Commun 13:6726.
59. Corkill J. E., Cuevas L. E., Gurgel R. Q., Greensill J., Hart C. A. (2001) SHV-27, a novel cefotaxime-hydrolysing beta-lactamase, identified in Klebsiella pneumoniae isolates from a Brazilian hospital. J Antimicrob Chemother 47:463–465.
60. Poirel L., et al. (2003) Emergence in Klebsiella pneumoniae of a chromosome-encoded SHV beta-lactamase that compromises the efficacy of imipenem. Antimicrob Agents Chemother 47:755–758.
61. Ben Achour N., Mercuri P. S., Ben Moussa M., Galleni M., Belhadj O. (2009) Characterization of a novel extended-spectrum TEM-type beta-lactamase, TEM-164, in a clinical strain of Klebsiella pneumoniae in Tunisia. Microb Drug Resist 15:195–199.
62. Nicolas M. H., Jarlier V., Honore N., Philippon A., Cole S. T. (1989) Molecular characterization of the gene encoding SHV-3 beta-lactamase responsible for transferable cefotaxime resistance in clinical isolates of Klebsiella pneumoniae. Antimicrob Agents Chemother 33:2096–2100.
63. Bethel C. R., et al. (2006) Role of Asp104 in the SHV beta-lactamase. Antimicrob Agents Chemother 50:4124–4131.
64. Kuzin A. P., et al. (1999) Structure of the SHV-1 beta-lactamase. Biochemistry 38:5720–5727.
65. Neubauer S., et al. (2020) A Genotype-Phenotype Correlation Study of SHV beta-Lactamases Offers New Insight into SHV Resistance Profiles. Antimicrob Agents Chemother 64.
66. Colque C. A., et al. (2021) Development of antibiotic resistance reveals diverse evolutionary pathways to face the complex and dynamic environment of a long-term treated patient. bioRxiv https://doi.org/10.1101/2021.05.14.444257
67. Soskine M., Tawfik D. S. (2010) Mutational effects and the evolution of new protein functions. Nat Rev Genet 11:572–582.
68. Remmert M., Biegert A., Hauser A., Soding J. (2011) HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nat Methods 9:173–175.
69. Seemayer S., Gruber M., Soding J. (2014) CCMpred--fast and precise prediction of protein residue-residue contacts from correlated mutations. Bioinformatics 30:3128–3130.
70. Bibik P., Alibai S., Pandini A., Dantu S. C. (2024) PyCoM: a python library for large-scale analysis of residue-residue coevolution data. Bioinformatics 40.
71. Mirdita M., et al. (2017) Uniclust databases of clustered and deeply annotated protein sequences and alignments. Nucleic Acids Res 45:D170–D176.
72. Csárdi G., Nepusz T. (2006) The igraph software package for complex network research.
73. Newman M. E. J. (2004) Detecting community structure in networks. The European Physical Journal B - Condensed Matter 38:321–330.
74. Newman M. E. (2006) Finding community structure in networks using the eigenvectors of matrices. Phys Rev E Stat Nonlin Soft Matter Phys 74:036104.
75. Martinez-Rosell G., Giorgino T., De Fabritiis G. (2017) PlayMolecule ProteinPrepare: A Web Application for Protein Preparation for Molecular Dynamics Simulations. J Chem Inf Model 57:1511–1516.
76. Maier J. A., et al. (2015) ff14SB: Improving the Accuracy of Protein Side Chain and Backbone Parameters from ff99SB. J Chem Theory Comput 11:3696–3713.
77. Berendsen H. J. C., Postma J. P. M., van Gunsteren W. F., DiNola A., Haak J. R. (1984) Molecular dynamics with coupling to an external bath. The Journal of Chemical Physics 81:3684–3690.
78. Doerr S., Harvey M. J., Noe F., De Fabritiis G. (2016) HTMD: High-Throughput Molecular Dynamics for Molecular Discovery. J Chem Theory Comput 12:1845–1852.
79. Harvey M. J., Giupponi G., Fabritiis G. D. (2009) ACEMD: Accelerating Biomolecular Dynamics in the Microsecond Time Scale. J Chem Theory Comput 5:1632–1639.
80. Davidchack R. L., Handel R., Tretyakov M. V. (2009) Langevin thermostat for rigid body dynamics. J Chem Phys 130:234101.

Article and author information

Author information

Manming Xu
UCL School of Pharmacy, London, United Kingdom
- Joint first authors
Sarath Chandra Dantu
Department of Computer Science, Brunel University London, London, United Kingdom
ORCID iD: 0000-0003-2019-5311
- Joint first authors
James A Garnett
Centre for Host-Microbiome Interactions, Faculty of Dentistry, Oral & Craniofacial Sciences, King’s College London, London, United Kingdom
Robert A Bonomo
Research Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States, Department of Molecular Biology and Microbiology, Case Western Reserve University School of Medicine, Cleveland, United States, Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, United States, Clinician Scientist Investigator, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States, Departments of Pharmacology, Biochemistry, and Proteomics and Bioinformatics, Case Western Reserve University School of Medicine, Cleveland, United States, CWRU-Cleveland VAMC Center for Antimicrobial Resistance and Epidemiology (Case VA CARES), Cleveland, United States
Alessandro Pandini
Department of Computer Science, Brunel University London, London, United Kingdom, University of Tabuk (PFSCBR), Tabuk, Saudi Arabia
ORCID iD: 0000-0002-4158-233X
- For correspondence: alessandro.pandini@brunel.ac.uk
Shozeb Haider
UCL School of Pharmacy, London, United Kingdom, University of Tabuk (PFSCBR), Tabuk, Saudi Arabia, UCL Center for Advanced Research Computing, University College London, London, United Kingdom
ORCID iD: 0000-0003-2650-2925
- For correspondence: Shozeb.haider@ucl.ac.uk

Author Notes

Competing interests: No competing interests declared

Version history

Preprint posted: November 3, 2024
Sent for peer review: November 8, 2024
Reviewed Preprint version 1: January 16, 2025
Reviewed Preprint version 2: March 12, 2025
Version of Record published: March 28, 2025

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.105005. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.
