Functionally Important Residues from Graph Analysis of Coevolved Dynamic Couplings

  1. UCL School of Pharmacy, London, United Kingdom
  2. Department of Computer Science, Brunel University London, London, United Kingdom
  3. Centre for Host-Microbiome Interactions, Faculty of Dentistry, Oral & Craniofacial Sciences, King’s College London, London, United Kingdom
  4. Research Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
  5. Department of Molecular Biology and Microbiology, Case Western Reserve University School of Medicine, Cleveland, United States
  6. Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, United States
  7. Clinician Scientist Investigator, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, United States
  8. Departments of Pharmacology, Biochemistry, and Proteomics and Bioinformatics, Case Western Reserve University School of Medicine, Cleveland, United States
  9. CWRU-Cleveland VAMC Center for Antimicrobial Resistance and Epidemiology (Case VA CARES), Cleveland, United States
  10. The Thomas Young Centre for Theory and Simulation of Materials, London, United Kingdom

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Yogesh Gupta
    The University of Texas Health Science Center at San Antonio, San Antonio, United States of America
  • Senior Editor
    Qiang Cui
    Boston University, Boston, United States of America

Reviewer #1 (Public review):

Summary:
As reported above, this paper by Xu et al. reports a new method that combines the analysis of coevolutionary patterns with dynamic profiles to identify functionally important residues and reveal correlations between binding sites.

Strengths:
In general, coevolutionary analysis and MD analysis are carried out separately, and while there have been attempts to compare the information provided by the two, no unified framework exists. Here, the authors convincingly demonstrate that integrating signals from dynamics and coevolution yields information that substantially exceeds what either method provides in isolation. While other methods are useful, they do not capture how dynamics is fundamental to defining function and thus sculpts coevolution via the 3D structure of the protein. At the same time, the authors demonstrate how coevolution in turn influences internal dynamics. The networks they reconstruct unveil information at an even higher level: the model starts from pairwise couplings, but through the network representation the authors arrive at community analysis, reporting on interaction patterns that extend beyond simple pairs.

Weaknesses:
The authors should:
- Make an effort to suggest or comment on the limits of applicability of their method;
- Expand the discussion of how DyNoPy compares to other methods;
- Comment on possible strategies for systems where their framework may not be suitable or applicable, since dynamics is not essential in all systems (e.g. structural proteins).

Reviewer #2 (Public review):

Summary:
The authors introduce a computational framework, DyNoPy, that integrates residue coevolution analysis with molecular dynamics (MD) simulations to identify functionally important residues in proteins. DyNoPy identifies key residues and residue-residue couplings to generate an interaction graph, and the authors attempt to validate it using two clinically relevant β-lactamases (SHV-1 and PDC-3).

Strengths:
DyNoPy not only shows the clinical relevance of known mutations but also predicts potential new evolutionary mutations. The authors provide biologically relevant insights into protein dynamics, which have potential applications in drug discovery and in understanding molecular evolution.

Weaknesses:
Although DyNoPy highlights the relevance of key residues both within and outside the active site, no experiments have been performed to validate these predictions. In addition, the authors should compare their method with conventional techniques and show how it differs.

An explanation of how the "communities" are defined in this work, and how they are relevant to the article, should be provided. In addition, the choice of collective variables and their relevance to coupled residue motion is not well explained. Dynamic cross-correlation maps are also a good method for understanding residue movements and can explain residue-residue coupling; it is not explained how DyNoPy differs from such conventional methods or whether it performs better.
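
For reference, the conventional dynamic cross-correlation map mentioned above is usually computed from atomic displacement vectors as C_ij = <Δr_i · Δr_j> / sqrt(<Δr_i²> <Δr_j²>). The following is a minimal NumPy sketch of that standard definition, not of DyNoPy itself; the function name `dccm` and the synthetic trajectory are illustrative assumptions, and the coordinates are assumed to be already aligned to a common reference.

```python
import numpy as np

def dccm(coords):
    """Dynamic cross-correlation map (standard definition).

    coords: array of shape (n_frames, n_atoms, 3), assumed to be
    pre-aligned to a common reference structure.
    Returns an (n_atoms, n_atoms) matrix with values in [-1, 1].
    """
    coords = np.asarray(coords, dtype=float)
    # Displacement of each atom from its time-averaged position
    disp = coords - coords.mean(axis=0)
    # <dr_i . dr_j>: dot product summed over x, y, z and averaged over frames
    cov = np.einsum("fix,fjx->ij", disp, disp) / coords.shape[0]
    # Normalise by sqrt(<dr_i^2> <dr_j^2>)
    norm = np.sqrt(np.diag(cov))
    return cov / np.outer(norm, norm)

# Hypothetical synthetic trajectory: atom 1 moves with atom 0,
# atom 2 moves exactly against it.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1, 3))
traj = np.concatenate([base, base, -base], axis=1)
c = dccm(traj)
```

In this toy case the entry for atoms 0 and 1 is fully correlated and the entry for atoms 0 and 2 is fully anti-correlated, which is the kind of pairwise signal a comparison with DyNoPy would be benchmarked against.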

In the sentence "DyNoPy identified eight significant communities of strongly coupled residues within SHV-1 (Supporting Fig. S4A)" I could not find a clear description of eight significant communities.

Again, the description of communities is not clear to me in the following sentence: "Detailed description of the other three communities is provided in the supporting information (Fig. S6)."

In the sentence "N170 acts as an intermediary between N136 and E166", kindly cite the figure that shows N170 as the intermediary residue.

Please be careful with the numbers. In the sentence "These residues not only interact with each other directly but are also indirectly coupled via 21 other residues." I counted 22 other residues, not 21.

In the sentence "Unlike other substitution sites that are adjacent to the active site, R205 is situated more than 16 Å away from catalytic serine S70". Please add this label somewhere in the figure.

Please cite a reference in the sentence "This indicates that mutations on G238 would result in an alteration on protein catalytic function, as well as an increased flexibility of the protein, which strongly aligns with previous finding."

Reviewer #3 (Public review):

Summary:
In this paper, Xu, Dantu and coworkers report a protocol for analyzing coevolutionary and dynamical information to identify a subset of communities that capture functionally relevant sites in beta-lactamases.

Strengths:
The combination of coevolutionary information and metrics from MD simulations is interesting for capturing functionally relevant sites, which can have implications in the fields of drug discovery but also in protein design.

Weaknesses:
The combination of coevolutionary information and metrics from MD simulations is not new, as other protocols have been proposed over the years (the current version of the paper neglects some of them; see below), and there are a few parameters of the protocol that, in my opinion, should be better analyzed and discussed.

(1) As mentioned, the introduction of the paper lacks some important publications in the field of using graph theory to represent important interaction networks extracted from MD simulations (DOI: 10.1002/pro.4911), and also combining MD data with MSA to identify functionally relevant sites for enzyme design (DOI: 10.1021/acscatal.4c04587, 10.1093/protein/gzae005).
(2) The matrix used for the graph analysis (J_ij) is built by summing the scaled coevolution and degree-of-correlation values. The alpha and beta weights are defined, and the authors mention that alpha is set to 0.5, so beta is also 0.5 to satisfy alpha + beta = 1. Why was a value of 0.5 selected? How does this choice affect the overall results and conclusions? The finding that many catalytically relevant residues are identified in the communities is not surprising, given that such sites usually have a high conservation score.
(3) Another important point that needs further explanation is the selection of the descriptor of protein dynamics. In this study two different strategies have been used (one more global, the other more local), but more details should be provided regarding their choice. What is the best strategy according to the authors? Why not use the same strategy for both related systems? Using one methodology or the other will have a large impact on the dynamical score. A related point: what is the impact of the MD simulation length, of how the MSA is generated, and of the number of sequences used for MSA construction?
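
The sensitivity question raised in point (2) can be made concrete with a small numerical sketch. The code below builds J = alpha * (scaled coevolution) + beta * (scaled correlation) with beta = 1 - alpha, then checks how many of the top-ranked residue pairs at alpha = 0.5 survive at other weights. Everything here is an assumption for illustration: the min-max scaling, the helper names `minmax`, `combined_matrix` and `top_pairs`, and the random input matrices are hypothetical and are not taken from DyNoPy or the manuscript.

```python
import numpy as np

def minmax(m):
    """Scale a matrix to [0, 1] (assumed scaling; the paper may use another)."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return (m - m.min()) / span if span > 0 else np.zeros_like(m)

def combined_matrix(coevo, corr, alpha=0.5):
    """J = alpha * scaled coevolution + beta * scaled correlation, beta = 1 - alpha."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    beta = 1.0 - alpha
    return alpha * minmax(coevo) + beta * minmax(corr)

def top_pairs(jmat, k):
    """Return the k highest-scoring residue pairs (i < j) as a set of tuples."""
    iu = np.triu_indices_from(jmat, k=1)
    order = np.argsort(jmat[iu])[::-1][:k]
    return {(int(iu[0][n]), int(iu[1][n])) for n in order}

# Hypothetical symmetric score matrices for a 30-residue protein
rng = np.random.default_rng(1)
a = rng.random((30, 30))
b = rng.random((30, 30))
coevo = (a + a.T) / 2
corr = (b + b.T) / 2

reference = top_pairs(combined_matrix(coevo, corr, alpha=0.5), k=20)
overlap = {}
for alpha in (0.0, 0.25, 0.5, 0.75, 1.0):
    pairs = top_pairs(combined_matrix(coevo, corr, alpha), k=20)
    # Fraction of the alpha = 0.5 pairs that are retained at this alpha
    overlap[alpha] = len(pairs & reference) / len(reference)
```

Reporting a table such as `overlap` for the real SHV-1 and PDC-3 matrices, or the analogous overlap between detected communities, would be one straightforward way to show how robust the conclusions are to the choice of alpha.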
