ProteinConformers: large-scale and energetically profiled descriptions of protein conformational landscapes

  1. Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
  2. School of Computing, National University of Singapore, Singapore, Singapore
  3. School of Economics and Management, Xi'an University of Posts & Telecommunications, Xi'an, China
  4. Center for AI and Computational Biology, Institute of Systems Medicine, Chinese Academy of Medical Sciences, Beijing, China
  5. School of Advanced Interdisciplinary Science, University of Chinese Academy of Sciences, Beijing, China
  6. State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
  7. School of Life Science and Technology, Xi'an Jiaotong University, Xi'an, China
  8. NITFID, School of Statistics and Data Science, AAIS, LPMC, and KLMDASR, Nankai University, Tianjin, China
  9. Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Aaron Frank
    Arrakis Therapeutics, Waltham, United States of America
  • Senior Editor
    Qiang Cui
    Boston University, Boston, United States of America

Reviewer #1 (Public review):

Summary:

The authors describe a new database that rigorously explores protein conformations.

Strengths:

It is extremely well done, using state-of-the-art tools by a group at the top of the field of structural modeling. The evaluation of qualities and the benchmarking of the structures are outstanding, and it is expected that the new database will have a significant impact on the field.

Weaknesses:

The authors use MD simulations to generate some of the structures, and therefore should have access to standard MD energies. I am surprised that no evaluation is provided based on these energies, which could be extended to free energies.

Reviewer #2 (Public review):

Summary:

The authors developed a dataset of protein conformations by running molecular dynamics simulations starting from both native and decoy conformations for a large number of proteins. These conformations were put together as a dataset for querying and downloading, along with their energies under different force fields. The authors suggest that such conformations represent the proteins' conformational landscape, so that they will be useful for evaluating methods generating multiple conformations of proteins.

Strengths:

The dataset is online and working. It has good documentation for others to use.

Weaknesses:

The biggest weakness is that the collected conformations very likely do not represent the true conformational landscape. To represent the conformational landscape, the structures need to be sampled according to the Boltzmann distribution. However, in this study, conformations are generated by running very short (125 ps to 375 ps) MD simulations starting from near-native conformations and decoys. Such short simulations will produce only small fluctuations around the starting conformations, so the distribution of conformations is largely dominated by the distribution of the initial conformations, which are by no means Boltzmann distributed. A conformation might be physically plausible, yet have a very small weight in the Boltzmann distribution. Conversely, conformations with large weights might not be in the dataset.
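
To make the notion of Boltzmann weight concrete, the following is a minimal sketch, not part of the manuscript or the database. It assumes each conformation has a single energy in kcal/mol; the function name `boltzmann_weights`, the temperature of 300 K, and the three example energies are hypothetical choices for illustration only.

```python
import math

# Boltzmann constant in kcal/(mol*K)
K_B = 0.0019872041


def boltzmann_weights(energies_kcal_mol, temperature_k=300.0):
    """Return normalized Boltzmann weights for a list of energies.

    Illustrative helper (hypothetical name): w_i = exp(-E_i / kT) / Z.
    """
    beta = 1.0 / (K_B * temperature_k)
    # Subtract the minimum energy before exponentiating for numerical stability;
    # this does not change the normalized weights.
    e_min = min(energies_kcal_mol)
    factors = [math.exp(-beta * (e - e_min)) for e in energies_kcal_mol]
    z = sum(factors)
    return [f / z for f in factors]


# Hypothetical energies for three conformations (not taken from the dataset)
energies = [0.0, 1.0, 5.0]
weights = boltzmann_weights(energies)
for e, w in zip(energies, weights):
    print(f"E = {e:4.1f} kcal/mol  weight = {w:.6f}")
```

Under these assumed values, the conformation lying 5 kcal/mol above the minimum carries a weight well below 0.1%, which illustrates how a physically plausible structure can still contribute very little to the equilibrium ensemble.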

Reviewer #3 (Public review):

Summary:

This manuscript describes a web-based tool that allows researchers to compare large numbers of representative ("plausible") conformations of proteins. It also includes energetic analysis from multiple widely used structure-prediction methods.

Strengths:

This tool will likely be useful for students who want to learn more about the ensemble properties of proteins. The resource is well organized and it represents a large amount of computing resources.

Weaknesses:

It is not entirely clear how the database may be utilized by other groups to advance research. It could be helpful if the authors add a short section that provides example use cases that illustrate how this database can support new strategies for studying protein dynamics.
