Peer review process
Not revised: This Reviewed Preprint includes the authors’ original preprint (without revision), an eLife assessment, public reviews, and a provisional response from the authors.
Editors
- Reviewing Editor: Luca Pinello, Massachusetts General Hospital, Boston, United States of America
- Senior Editor: Christian Landry, Université Laval, Québec, Canada
Reviewer #1 (Public Review):
Cell type deconvolution is one of the early and critical steps in the analysis and integration of spatial omic and single cell gene expression datasets, and there are already many approaches proposed for the analysis. Sang-aram et al. provide an up-to-date benchmark of computational methods for cell type deconvolution.
In doing so, they provide some (perhaps subtle) additional elements that I would say are above average for a benchmarking study: i) a full Nextflow pipeline to reproduce their analyses; ii) methods implemented in Docker containers (which others can use to run their own datasets); iii) a fairly decent assessment of their simulator compared to other spatial omics simulators. A key aspect of their results is that they are generally very concordant between real and synthetic datasets. It is also important that the authors include an appropriate "simpler" baseline method to compare against; surprisingly, several methods performed below this baseline. Overall, because of these elements, this study has the potential to set the standard of benchmarks higher.
The only weakness of this study that I can readily see is that this is a very active area of research and we may see other types of data start to dominate (CosMx, Xenium) and new computational approaches will surely arrive. The Nextflow pipeline will make the prospect of including new reference datasets and new computational methods easier.
Reviewer #2 (Public Review):
In this manuscript, Sang-aram et al. provide a systematic methodology and pipeline for benchmarking cell type deconvolution algorithms for spatial transcriptomic data analysis in a reproducible manner. They developed a tissue pattern simulator that starts from single-cell RNA-seq data to create silver standards and used spatial aggregation strategies from real in situ-based spatial technologies to obtain gold standards. By using several established metrics combined with different deconvolution challenges, they systematically scored and ranked 11 deconvolution methods and assessed both functional and usability criteria. Altogether, they present a reusable and extendable platform and reach very similar conclusions to other deconvolution benchmarking papers, including that RCTD, SpatialDWLS and Cell2location typically provide the best results.
More specifically, the authors of this study sought to construct a methodology for benchmarking cell type deconvolution algorithms for spatial transcriptomic data analysis in a reproducible manner. The authors leveraged publicly available scRNA-seq, seqFISH, and STARMap datasets to create synthetic spatial datasets modeled after those of the Visium platform. It should be noted that the underlying experimental technique of seqFISH and STARMap (in situ hybridization) does not parallel that of Visium (sequencing), which could bias the simulated data. Furthermore, to generate the ground truth datasets, cells and their corresponding count matrices are represented by simple centroids. Although this simplifies the analysis, it might not accurately reflect Visium spots, where cells could lie on a spot boundary and affect deconvolution results. In addition, the authors state that for the silver standard datasets one half of the scRNA-seq data was used for simulation and the other half was used as a reference for the algorithms, but the method of splitting the data, i.e., at random or proportionally by cell type, was not specified. Supplying optimal reference data is important to achieve the best performance, as the authors note in their conclusions.
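To make the splitting question concrete, the following is a minimal sketch of the two options mentioned above: a purely random half split versus a split that is proportional by cell type. It is not the authors' procedure, which is not specified; the function names, the toy cell type labels, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def random_split(labels, seed=0):
    """Split cell indices in half uniformly at random, ignoring cell type."""
    rng = random.Random(seed)
    idx = list(range(len(labels)))
    rng.shuffle(idx)
    half = len(idx) // 2
    return sorted(idx[:half]), sorted(idx[half:])


def stratified_split(labels, seed=0):
    """Split cell indices in half within each cell type (proportional split)."""
    rng = random.Random(seed)
    by_type = defaultdict(list)
    for i, lab in enumerate(labels):
        by_type[lab].append(i)
    first, second = [], []
    for lab in sorted(by_type):  # sorted for a deterministic order
        members = by_type[lab]
        rng.shuffle(members)
        half = len(members) // 2
        first.extend(members[:half])
        second.extend(members[half:])
    return sorted(first), sorted(second)


# Toy labels (hypothetical): one abundant and one rare cell type.
labels = ["neuron"] * 96 + ["microglia"] * 4

sim_idx, ref_idx = stratified_split(labels)
print("simulation half:", Counter(labels[i] for i in sim_idx))
print("reference half: ", Counter(labels[i] for i in ref_idx))

rand_sim, rand_ref = random_split(labels)
print("random simulation half:", Counter(labels[i] for i in rand_sim))
```

With a random split, a rare cell type can by chance end up mostly or entirely in one half, so the reference may under-represent it; the stratified version guarantees that each type is divided as evenly as possible between the simulation and reference halves.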
The authors thoroughly and rigorously compare methods while addressing situational discrepancies in model performance, indicative of a strong analysis. The authors make a point to address both inter- and intra-dataset reference handling, which has a significant impact on performance. Major strengths of the simulation engine include the ability to downsample and recapitulate several cell and tissue organization patterns.
It's important to realize that deconvolution approaches are typically part of larger exploratory data analysis (EDA) efforts and require users to change parameters and input data multiple times. Furthermore, many users might not have access to more advanced computing infrastructure (e.g. GPU) and thus running time, computing needs, and scalability are probably key factors that researchers would like to consider when looking to deconvolve their datasets.
The authors achieve their aim to benchmark different deconvolution methods and the results from their thorough analysis support the conclusions that many methods are still outperformed by bulk deconvolution methods. This study further informs the need for cell type deconvolution algorithms that can handle both cell abundance and rarity throughout a given tissue sample.
The reproducibility of the methods described will have significant utility for researchers looking to develop cell type deconvolution algorithms, as this platform will allow simultaneous replication of the described analysis and comparison to new methods.
Reviewer #3 (Public Review):
The authors thoroughly evaluate the performance and scalability of existing cell-type deconvolution methods. The paper builds on the existing knowledge by considering the suitability of deconvolution algorithms in the context of more challenging analyses where rare cell types are present or when dealing with unmatched references or noise introduced by a highly abundant cell type within the data. The paper also presents a new simulation framework for spatial transcriptomics data to support their benchmarking effort.
● Major strengths and weaknesses of the methods and results.
While most benchmarking studies rely on publicly available spatial transcriptomics datasets, one of the major strengths of the paper is the additional supporting evidence from its silver standard datasets. Using their simulator, synthspot, the authors generated abundant synthetic spatial transcriptomics data with replicates. In addition, the data generation process accounts for 9 different biological patterns to stay close to the quality of real data. The authors also communicated with the original authors of each benchmarked method to ensure correct implementation and optimal performance. Figure 2 provides a clear and concise summary of the benchmark results, which will be of great assistance to users who are contemplating conducting deconvolution analysis.
The simulation setup has a significant weakness in the selection of reference single-cell RNA-seq datasets used for generating synthetic spots. It is unclear why a mix of mouse and human scRNA-seq datasets was chosen, as this does not reflect a realistic biological scenario. This could call into question the findings of the "detecting rare cell types remains challenging even for top-performing methods" section of the paper, as true "rare cell types" would not be as distinct as human skin cells in a mouse brain setting, as simulated here. Furthermore, it is unclear why the authors developed Synthspot when other similar frameworks, such as SRTsim, exist. Have the authors explored other simulation frameworks? Finally, we would have appreciated the inclusion of tissue samples with more complex structures, such as those from tumors, where there may be more intricate mixing between cell types and spot types.
The authors have effectively accomplished their objectives in benchmarking deconvolution methods by thoughtfully designing the experiments and selecting appropriate evaluation metrics. This paper will be highly beneficial for the community.
This paper can provide guidance for selecting the most appropriate deconvolution methods for user-defined scenarios of interest. Synthspot allows for generating more realistic artificial tissue data with specific spatial patterns and is integrated into an easy-to-use and adaptable Nextflow pipeline. It might be worthwhile to clearly differentiate this work from previous work in either the benchmarking area or the SRT data simulation area.