Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, and public reviews.
Editors
- Reviewing Editor: María Zambrano, CorpoGen, Bogotá, Colombia
- Senior Editor: Detlef Weigel, Max Planck Institute for Biology Tübingen, Tübingen, Germany
Reviewer #1 (Public Review):
This manuscript presents SAVEMONEY, a computational tool designed to enhance the utilization of Oxford Nanopore Technologies (ONT) long-read sequencing for the design and analysis of plasmid sequencing experiments. In the past few years, with improvements in both read length and accuracy, ONT sequencing has been rapidly extended to almost all omics analyses that are currently dominated by short-read sequencing (e.g., Illumina). However, the relatively high error rates of long-read sequencing technologies, including PacBio and ONT, remain a major obstacle for plasmid/clone-based sequencing services that aim to achieve single-nucleotide accuracy. This work provides a guideline for sequencing multiple plasmids together in the same ONT run without molecular barcoding, followed by data deconvolution. The overall algorithmic framework is well designed, and both real and simulated data are used to support the conclusions. SAVEMONEY targets users who have their own ONT sequencers and perform library preparation and sequencing themselves, rather than relying on commercial services. As the authors discuss, in practice researchers routinely pick multiple colonies from the same plasmid construction and submit them for Sanger sequencing to ensure accuracy. However, unlike barcoding-based approaches, SAVEMONEY cannot support the simultaneous analysis of multiple colonies in the same run. This major limitation reduces the significance of the work. Encouraging computational efforts in ONT data debarcoding for mixed-plasmid or even single-cell sequencing would be more valuable to the field.
1. To provide more comprehensive information for cost-conscious users, the Introduction section should include a more detailed cost comparison between Sanger and ONT sequencing, covering different ONT platforms (MinION, PromethION, Flongle), chemistries (flow cells) and kits. This additional information would be helpful for users who have their own sequencers and are the target audience for SAVEMONEY.
2. In "Overview of the algorithm" (Pages 3-4) under the Results section, instead of stating "However, coverage varies from ~100-1000 and is difficult to predict because each nanopore flow cell has different properties.", it would be beneficial to provide more detailed information, such as read length and yield/read count per flow cell for the different platforms. This information would help users design their own experiments effectively.
3. While this study optimized and evaluated the tool using a total of 14 plasmids, it may not provide sufficient power to represent the diversity of the plasmid world. Consideration should be given to expanding the dataset to include a broader range of plasmids in future studies to enhance the robustness and generalizability of the tool.
4. If applicable and feasible, including a comparison or benchmark of SAVEMONEY against other similar tools would further strengthen the manuscript. This comparison would allow users to evaluate the advantages and disadvantages of different tools for their specific needs.
5. The importance of pre-filtering raw sequencing reads should be emphasized as noisy reads can significantly impact the overall performance of the tool. It is essential to clarify whether any pre-filtering steps were performed in this study, such as filtering based on quality scores, read length, or other relevant factors.
6. The statement regarding the number of required reads per plasmid (20-30) and the maximum number of plasmids (up to six) that can be mixed in a single run may become outdated due to the rapid advancements in ONT technology. In the Discussion section, instead of assuming specific numbers, it would be more beneficial to provide information based on the current state of ONT sequencing, such as the number of reads per MinION flow cell that can be produced.
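The arithmetic behind points 2 and 6 can be sketched as follows. This is only an illustration of how a user might turn a flow-cell read count into a pooling limit; the function name, the usable-read fraction, and the example numbers are hypothetical and are not taken from the manuscript or from ONT specifications. Only the 20-30 reads-per-plasmid range comes from the statement under discussion.

```python
def max_plasmids_per_run(total_reads: int, usable_fraction: float,
                         reads_per_plasmid: int) -> int:
    """Estimate how many plasmids one run could cover.

    total_reads       -- reads produced by the run (hypothetical input)
    usable_fraction   -- share of reads surviving any pre-filtering (assumed)
    reads_per_plasmid -- reads needed per plasmid for a reliable consensus
    """
    if not 0.0 <= usable_fraction <= 1.0:
        raise ValueError("usable_fraction must be between 0 and 1")
    if reads_per_plasmid <= 0:
        raise ValueError("reads_per_plasmid must be positive")
    usable_reads = int(total_reads * usable_fraction)
    # Integer division: a plasmid with too few reads does not count as covered.
    return usable_reads // reads_per_plasmid


# Illustrative numbers only: 1000 reads, half of them usable,
# and the upper end of the 20-30 reads-per-plasmid range.
print(max_plasmids_per_run(1000, 0.5, 30))
```

Reporting read counts per flow cell, as suggested above, would let readers redo this calculation as the technology changes instead of relying on a fixed number of plasmids.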
Reviewer #2 (Public Review):
The authors developed an algorithm that allows deconvolution of plasmid sequences from a mixture of plasmids sequenced with nanopore long-read technology. Because library preparation and barcoding of individual samples increase sequencing costs, the algorithm bypasses this need and thus reduces both sample preparation time and sequencing costs. In the first step, the tool assesses which plasmid constructs can be mixed in a single library preparation by calculating a distance matrix between the reference plasmid and the constructs, producing sequence clusters. The user is given groups of plasmids, drawn from different clusters, to be pooled together for sequencing. After sequencing, the algorithm deconvolutes the reads by classifying them based on alignments to the reference sequence. A Bayesian analysis approach is used to obtain a consensus sequence and quality scores.
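The pooling step described above can be illustrated with a minimal sketch. It is not SAVEMONEY's implementation: plain Levenshtein edit distance stands in for the tool's distance measure, a greedy assignment stands in for its clustering, and the names `edit_distance`, `assign_pools`, `min_distance` and `max_pool_size` are hypothetical. The idea shown is only that plasmids too similar to each other end up in different pools.

```python
from typing import Dict, List


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using the classic two-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def assign_pools(plasmids: Dict[str, str], min_distance: int,
                 max_pool_size: int = 6) -> List[List[str]]:
    """Greedily place each plasmid in the first pool where it differs from
    every member by at least min_distance; open a new pool otherwise."""
    pools: List[List[str]] = []
    for name, seq in plasmids.items():
        for pool in pools:
            if len(pool) < max_pool_size and all(
                edit_distance(seq, plasmids[other]) >= min_distance
                for other in pool
            ):
                pool.append(name)
                break
        else:
            pools.append([name])
    return pools


# Toy sequences (hypothetical): A and B differ by one base, C is unrelated.
toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAA", "C": "TTTTGGGGCC"}
print(assign_pools(toy, min_distance=2))
```

In this toy case A and B are separated because they differ by a single base, while C can share a pool with A. The quadratic-time distance used here would be too slow for real plasmids of several kilobases; an alignment library would be the practical choice.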
Strengths
The authors exploit one of the main advantages of long-read sequencing, namely its ability to accurately resolve regions of high complexity, which are regularly found in plasmids, and have developed a tool that can validate plasmid constructs while reducing sequencing costs. Multiple plasmids (up to six) can be analyzed simultaneously in a single library without the need for sample barcoding, which also reduces sample preparation time. Although inserts must be different, a difference of just 2 bases is enough for correct assignment. The tool maximizes cost-efficiency for projects that require large numbers of plasmid constructs and high-throughput validation.
Weaknesses
The method proposed by the authors requires prior knowledge of the plasmid sequences (i.e., blueprints or plasmid references) and is not suitable for small experiments. The plasmid inserts or backbones must be different; for example, multiple colonies from the same plasmid construction effort cannot be submitted together.