Figures and data

The framework of TSvelo.
(a) The preprocessing strategy in TSvelo, including velocity gene selection and initial state detection. (b) The Neural ODE model and its optimization in TSvelo, where the parameters and latent time are optimized iteratively. (c-e) The downstream applications of TSvelo, including precise fitting of the transcription-unspliced-spliced 3D phase portrait (c), cell fate prediction using the predicted RNA velocity (d), and gene expression pattern analysis for multi-lineage datasets (e).

Results on the pancreas dataset.
(a) The pseudotime learned by TSvelo. (b) The stream plot visualizing the RNA velocity inferred by TSvelo. (c) The quantitative comparison between TSvelo and multiple baseline approaches. The lower, central and upper hinges correspond to the first quartile, median and third quartile, respectively. The whiskers extend up to 1.5× the interquartile range from each hinge. The boxplot for each method includes 3,696 samples. (d) The dynamics fitting on MAML3, ANXA4 and GSTZ1. For each gene, four plots are displayed in a 2×2 layout: the u-t, s-t, u-s, and alpha-u plots. (e) The unspliced-spliced phase portrait fitting on MAML3, ANXA4 and GSTZ1 obtained by four baseline RNA velocity approaches: scVelo, Dynamo, UniTVelo and cellDancer. (f) The dynamics fitting in the transcription-unspliced-spliced 3D phase portrait for MAML3, ANXA4 and GSTZ1.
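
The boxplot convention described in panel (c) can be illustrated with a minimal sketch. It assumes the common rule that each whisker ends at the most extreme observation lying within 1.5× the interquartile range of its hinge, and it uses one particular quartile interpolation (`method="inclusive"`); the plotting code behind the figure may differ on both points. The function name `boxplot_stats` and the example scores are hypothetical and are not taken from the paper.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Hinges and whisker ends for one box (assumed convention, see lead-in)."""
    data = sorted(values)
    # Lower hinge, central line and upper hinge: Q1, median, Q3.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Fences sit 1.5 x IQR beyond each hinge.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    lower_whisker = min(v for v in data if v >= lower_fence)
    upper_whisker = max(v for v in data if v <= upper_fence)
    # Anything beyond the fences would be drawn as an individual point.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_whisker": lower_whisker,
        "upper_whisker": upper_whisker,
        "outliers": outliers,
    }


# Hypothetical per-cell scores for a single method.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 50]
stats = boxplot_stats(scores)
print(stats)
```

In this toy input the value 50 lies beyond the upper fence, so the upper whisker stops at 9 and 50 is reported as an outlier.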

Results on the gastrulation erythroid dataset.
(a) The GO terms most enriched in the selected velocity genes of TSvelo. (b) The pseudotime learned by TSvelo. (c) The stream plot visualizing the RNA velocity inferred by TSvelo. (d-f) The quantitative comparison between TSvelo and multiple baseline approaches in terms of velocity consistency (d), in-cluster coherence (e), and cross-boundary direction correctness (f). The lower, central and upper hinges correspond to the first quartile, median and third quartile, respectively. The whiskers extend up to 1.5× the interquartile range from each hinge. The boxplot for each method includes 9,815 samples. (g) The dynamics fitting on HSP90AB1 obtained by TSvelo. Four plots are displayed in a 2×2 layout: the u-t, s-t, u-s, and alpha-u plots, where u, s and alpha denote the abundance of unspliced mRNA, the abundance of spliced mRNA, and the learned transcriptional representation, respectively. (h) The phase portrait fitting on HSP90AB1 obtained by baseline approaches. (i) The dynamics fitting on RPS26 obtained by TSvelo. (j) The TFs with the highest-ranked weights as identified by TSvelo. (k) The weights of the KLF1 targets with the highest absolute weights. (l) Temporal dynamics of KLF1 and its highest-weighted target genes (HBA-X, ALAS2 and GYPA) along pseudotime.

Results on the mouse brain dataset.
(a) The velocity stream inferred by Multivelo. (b) The stream plot visualizing the RNA velocity inferred by TSvelo. (c) The pseudotime learned by TSvelo. (d-f) The dynamics fitting on genes MEIS2 (d), BASP1 (e) and MSI2 (f) by both Multivelo and TSvelo. In each panel, the leftmost plot shows the phase portrait fitting of Multivelo, and the next two columns show TSvelo’s results. The rightmost plot shows the learned transcriptional rate (in green), unspliced abundance (in blue), and spliced abundance (in red) along pseudotime. Since the transcriptional rate is calculated for each individual cell, we fit a Generalized Additive Model (GAM) to the transcriptional representation across cells along pseudotime and present the GAM-fitted curves to better visualize its trends in these plots.

Results on the multi-lineage dentate gyrus dataset.
(a) The GO terms enriched in the selected velocity genes of TSvelo. (b) The pseudotime learned by TSvelo. (c) The stream plot visualizing the RNA velocity inferred by TSvelo. Three lineages are detected: the granule, CA and glial lineages. (d) The velocity stream inferred by scVelo. (e) The velocity stream inferred by cellDancer. (f-h) The dynamics modeling of TSvelo on three axonogenesis-related genes, ANK3 (f), MAP1B (g) and SLC1A2 (h). In the leftmost plot of each panel, the lines represent the predicted spliced abundance across all lineages, with the color indicating the cell types most strongly associated with each pseudotime point along the corresponding lineage. Additionally, expression data for each lineage are shown as translucent points. The remaining plots in each panel display the dynamics of the learned transcriptional rate (in green), unspliced abundance (in blue), and spliced abundance (in red) along pseudotime for each lineage. The transcriptional representation in these plots is also smoothed using GAM fitting.

Results on the LARRY dataset.
(a) The Leiden clustering on LARRY. (b) The initial Leiden cluster detected in preprocessing. (c) The pseudotime learned by TSvelo. (d) The stream plot visualizing the RNA velocity inferred by TSvelo. (e) The GO terms enriched in the selected velocity genes of TSvelo. (f) The dynamics modeling of TSvelo on four genes related to neutrophil development: PYGL, MS4A3, CLEC12A and LTA4H. The lines represent the predicted spliced abundance across all lineages, with the color indicating the cell types most strongly associated with each pseudotime point along the corresponding lineage. (g) The dynamics modeling of PYGL, MS4A3, CLEC12A and LTA4H on the neutrophil lineage.