Bactabolize: A tool for high-throughput generation of bacterial strain-specific metabolic models

  1. Department of Infectious Diseases, Central Clinical School, Monash University, Melbourne, Victoria, Australia
  2. Microbiology Unit, Alfred Health, Melbourne, Victoria, Australia
  3. Department of Bioengineering, University of California, San Diego, CA, United States of America
  4. Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, UK

Peer review process

Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.


Editors

  • Reviewing Editor
    Marisa Nicolás
    Laboratório Nacional de Computação Científica, Rio de Janeiro, Brazil
  • Senior Editor
    Christian Landry
    Université Laval, Québec, Canada

Reviewer #1 (Public Review):

In this work, Vezina et al. present Bactabolize, a rapid reconstruction tool for the generation of strain-specific metabolic models. Similar to other reconstruction pipelines such as CarveMe, Bactabolize builds a strain-specific draft reconstruction and subsequently gap-fills it. The model can afterwards be used to predict growth in any defined medium the user specifies. The authors constructed a pan-model of the Klebsiella pneumoniae species complex (KpSC) and used it as input for Bactabolize to construct a genome-scale reconstruction of K. pneumoniae KPPR1. They compared the generated reconstruction with a reconstruction built through CarveMe as well as a manually curated reconstruction for the same strain. They then compared predictions of carbon, nitrogen, phosphorus, and sulfur source utilization and found that the Bactabolize reconstruction had the highest overall accuracy. Finally, they built draft reconstructions for 10 clinical isolates of K. pneumoniae and evaluated their predictive performance. Overall, this is a useful tool, the data are well presented, and the paper is well written. However, the predictions are compared with only two existing reconstruction tools, although more have been published recently.
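To make the accuracy comparison above concrete, the following is a minimal sketch of how predicted growth on a set of substrates could be scored against observed growth. It is an illustration only, not the authors' evaluation code: the function name `prediction_accuracy` and the substrate values are hypothetical and are not taken from the manuscript.

```python
def prediction_accuracy(predicted, observed):
    """Fraction of shared substrates where predicted growth matches observed growth."""
    # Only score substrates that appear in both the predictions and the assay data.
    shared = predicted.keys() & observed.keys()
    if not shared:
        raise ValueError("no substrates in common")
    matches = sum(predicted[s] == observed[s] for s in shared)
    return matches / len(shared)


# Hypothetical example data (True = growth, False = no growth).
predicted = {"glucose": True, "citrate": True, "urea": False, "sulfate": True}
observed = {"glucose": True, "citrate": False, "urea": False, "sulfate": True}

print(f"accuracy = {prediction_accuracy(predicted, observed):.2f}")
```

Scoring each reconstruction against the same observed phenotypes in this way would give one accuracy value per tool, which is the kind of figure the comparison relies on.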

Reviewer #2 (Public Review):

The authors present a computational tool for high-throughput generation of bacterial strain-specific metabolic models. The study seems interesting. However, I have the following concerns.

1. In the results section "description of Bactabolize", the authors present technical details on how to generate a metabolic model. For the input and output, please provide concrete examples to show the functionality of Bactabolize.

2. A KpSC pan-metabolic reference model is provided. Is it required as input for Bactabolize? Is the gene and metabolite information openly accessible to users?

3. For the generated metabolic models, the authors present comparisons with other methods. However, they report only the numbers of genes, metabolites, and substrates. Since the interactions between genes, metabolites, and substrates are also critical, please provide, if possible, coverage details for these interactions. A Venn diagram is recommended for comparing these coverage differences.

4. Do quality control and gap-filling need to be performed when constructing a new metabolic model?

5. Are there any visualization results to check the status of the generated draft model?
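As an illustration of the coverage comparison requested in point 3, the sketch below computes the seven regions of a three-set Venn diagram from the identifier sets of three models. It is a sketch under stated assumptions: the function name `venn_regions` and the reaction identifiers `R1` to `R6` are placeholders, not data from the manuscript, and the same approach would apply to gene or metabolite identifiers.

```python
def venn_regions(a, b, c):
    """Return the seven disjoint regions of a three-set Venn diagram."""
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "all_three": a & b & c,
    }


# Placeholder reaction identifiers for three reconstructions of the same strain.
bactabolize = {"R1", "R2", "R3", "R4"}
carveme = {"R2", "R3", "R5"}
curated = {"R3", "R4", "R5", "R6"}

regions = venn_regions(bactabolize, carveme, curated)
for name, members in regions.items():
    print(name, len(members), sorted(members))
```

The region sizes can be passed to any plotting tool to draw the diagram; listing the members of each region also shows which specific reactions differ between models.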

Reviewer #3 (Public Review):

The authors present a pipeline for generating strain-specific genome-scale metabolic models for bacteria using Klebsiella spp. as the demonstrative data. The proposed improvement of performance and accuracy in this process holds great value. However, the demonstrated evidence, justification, and validation methods require further discussion.

Apart from the claim to quickly and accurately produce strain-specific models, the manuscript highlights the need to create pan-metabolic models from manually curated models, which are relatively time-consuming to build and can only be produced for well-established organisms. Therefore, claims to speed up the process are redundant.

The justification and evaluation of the generated models are inadequate and one-dimensional. The authors only focus on statistics such as the number of reactions and genes in the models, which does not accurately depict the completeness of the model.

Furthermore, the authors compare their results solely with the performance of the previously published CarveMe package, and the results do not clearly demonstrate the superior performance of the Bactabolize tool that they developed.

The authors have not provided evidence or discussion on the accuracy of any metabolic fluxes, which are considered to be crucial for reconstructing metabolic models. Additionally, the authors have not mentioned the importance of non-growth associated maintenance and the criticality of biomass composition analysis, both of which significantly determine the fluxes in the system.

Overall, the work holds potential for direct application in certain specific aims and fields. However, the cryptic details and critical points of the justification regarding the completeness of the models require further discussion. A detailed discussion on the importance of manually curated models and the potential future direction of incorporating machine learning into the process would significantly enhance the quality of the manuscript.
