Analytical process on the artificial dyntoy dataset

Dashed lines mark selections made by the user. Panels A-C show a scheme of the whole dataset; panels D-I correspond to the interactive GUI for this dataset. (A) The scheme of a point cloud representing single-cell data, where sparse regions α and β are present. (B) The dataset is represented as a k-NNG and the cell of origin is selected. (C) Pseudotime is calculated as expected hitting distance and the edges of the k-NNG are oriented according to the pseudotime. Candidate endpoints are automatically suggested as vertices without outgoing edges (circled). (D) Estimated pseudotime is represented by a yellow-to-red color gradient; the origin is shown as the purple dot. Potential developmental fates (x, y, z) are shown as gray points on the vaevictis plot. Endpoints can be selected for further investigation (in this example, all endpoints are selected; dashed line). (E) Sparse regions are represented on a persistence diagram, which enables the selection of the significant sparse regions (dashed line). The sparse regions γ and δ correspond to essential classes created by the selection of disparate endpoints. In this way, random walks are classified using the selected ends and sparse regions. (F) All random walks are shown on the vaevictis plot and their classification into trajectories is depicted in the hierarchical clustering dendrogram. (G) By selecting branch (a) of this dendrogram, which leads to the endpoint (x), the given trajectory can be investigated in detail, including viewing the average expression of multiple markers along pseudotime. (H) By selecting two endpoints (x and y) but only a single relevant marker, its dispersion and possible branching points can be examined. Trajectories (a and e) from panel F are visualized and the branching region (p) is selected. (I) Multiple trajectories (trajectory i in red and a and e in blue) can be visualized on the 2D plot. Points in the selected branching region (p) are highlighted in green.
(J) Connectome representing the basic structure of the data and the simulated random walks. The pie charts indicate the distributions of cell populations in each vertex (cluster). The arrows show the direction of pseudotime. The vertex containing the cell of origin is labeled O and vertices containing endpoints are labeled T.
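
The steps in panels B and C can be illustrated with a minimal sketch on a toy graph. This is not the tviblindi implementation. It assumes an undirected, connected graph given as an adjacency matrix (the k-NNG construction is omitted), and it takes "expected hitting distance" to mean the expected number of steps a simple random walk needs to first reach the origin from a given vertex; the definition used by the software may differ. The function names are hypothetical.

```python
import numpy as np

def hitting_time_to_origin(adj, origin):
    """Expected number of random-walk steps to first reach `origin`.

    Assumption: simple random walk on an undirected, connected graph.
    Solves h = 1 + P h on all non-origin vertices, with h[origin] = 0.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    P = adj / adj.sum(axis=1, keepdims=True)      # row-stochastic transitions
    others = np.array([i for i in range(n) if i != origin])
    A = np.eye(n - 1) - P[np.ix_(others, others)]
    h = np.zeros(n)
    h[others] = np.linalg.solve(A, np.ones(n - 1))
    return h

def candidate_endpoints(adj, pseudotime):
    """Vertices with no outgoing edge once edges point toward higher pseudotime."""
    adj = np.asarray(adj)
    endpoints = []
    for v in range(adj.shape[0]):
        neighbors = np.flatnonzero(adj[v])
        if not np.any(pseudotime[neighbors] > pseudotime[v]):
            endpoints.append(v)
    return endpoints

# Toy Y-shaped graph: 0 - 1 - 2 - 3, plus a short branch 1 - 4.
edges = [(0, 1), (1, 2), (2, 3), (1, 4)]
adj = np.zeros((5, 5))
for a, b in edges:
    adj[a, b] = adj[b, a] = 1.0

pseudotime = hitting_time_to_origin(adj, origin=0)
endpoints = candidate_endpoints(adj, pseudotime)
print(pseudotime)   # pseudotime per vertex, 0 at the origin
print(endpoints)    # tips of the two branches
```

In this toy graph the two branch tips (vertices 3 and 4) have no neighbor with a higher pseudotime, so they are the suggested endpoints, analogous to the circled vertices in panel C.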

Visualization of the basic structure of the T-cell compartment using vaevictis

(A) Bivariate CD4 x CD8 dot plot of cells from human thymus (brown) and human peripheral blood (blue) acquired using mass cytometry. The cells were gated as DN, DP, CD4 SP and CD8 SP. (B) vaevictis plot of 37-parameter (including barcode) mass cytometry panel measurement showing the positions of human thymus (brown) and human peripheral blood (blue), with the CD34pos progenitors shown in green. (C) Expression of CD8, CD4 and Annexin V shown using a blue-green-yellow-red color gradient on the vaevictis plot. Blue color indicates the lowest expression and red color indicates the highest.

Developmental endpoints and detailed analysis of major trajectory leading to endpoint #1 by tviblindi

(A) vaevictis plot of T-cell development in thymus and peripheral blood. Estimated pseudotime is represented by a yellow-to-red color gradient with CD34pos progenitors as the population of origin (purple dot). Gray dots indicate the discovered developmental endpoints. Endpoint #1, highlighted by a blue rectangle, represents mature CD4pos effector memory T cells selected for further exploration. (B) The connectome shows the low-resolution structure of the data and of the simulated random walks. (C) Persistence diagram representing sparse regions detected within the point cloud of measured cells. The orange rectangle marks a user-defined selection of sparse regions. (D) Dendrogram of clustered trajectories showing a subcluster of 452 walks (red rectangle, labeled I) within a larger group of random walks (blue rectangle, labeled II). The number to the left of each leaf indicates the number of random walks in the leaf. (E) vaevictis plot displaying the trajectories selected in panel D in corresponding colors. (F) Pseudotime line plot showing the average expression of selected individual markers along the trajectory to endpoint #1 (top). The selected areas of interest corresponding to T-cell developmental stages (green rectangles) are shown in green (as indicated by the arrows) on the vaevictis plots (below).

tviblindi analysis of trajectories leading to apoptosis

(A) Dendrogram identical to Figure 3D with additional trajectories selected for closer investigation (blue rectangles), labeled III (137 random walks) and V (159 random walks). (B) vaevictis plot showing the topology of trajectories in leaves III and V. Apoptotic cells are shown in green. (C) Pseudotime line plot showing the average expression of selected markers along calculated pseudotime. The green rectangle highlights the region with increased expression of the apoptotic marker Annexin V and decreased expression of phosphotyrosine. Events in the selected region are displayed in panel 4B in green (see the arrow). (D) A detailed dendrogram of leaf V from panel 4A. Two distinct trajectories were selected for further analysis (blue rectangle, Va; red rectangle, Vb). (E) vaevictis plot showing the topology of trajectories in leaves Va and Vb. The green polygon marks the region of apoptosis; the gray polygon marks the region of more advanced stages of thymic and peripheral CD4 T-cell development. (F) Pseudotime line plot depicting the average expression of selected markers along the trajectories Va and Vb. The region of apoptosis and the region of more advanced stages of CD4 T-cell development are highlighted (green and gray rectangles, respectively).

tviblindi analysis of the trajectory leading to End#6

(A) A trajectory leading to End#6 located in the thymic portion of the vaevictis plot (compare to Figure 2B). (B) Pseudotime line plot showing average expression of selected individual markers along developmental pseudotime. (C) vaevictis plot showing the trajectory to the conventional naive CD4 SP T cells (blue) and the trajectory to End#6 (orange). Conventional naive CD4 SP T cells are shown in purple, the immature Treg stage in brown and End#6 in orange. (D) Bivariate plot of CD25 and CD127 expression on gated conventional naive CD4 SP T cells (purple), immature Treg stage (brown) and End#6 (orange). (E) Bivariate plot of Helios and FOXP3 expression on conventional naive CD4 SP T cells (purple), immature Treg stage (brown) and End#6 (orange) in a validation experiment.

Detailed analysis of End#6

(A) Visualization of the TREC/TCRAC ratio, depicted as the calculated number of divisions for each population. Symbol code: cells from peripheral blood (blue), cells from thymus (brown), cells from adult donor (circle), cells from pediatric donor (triangle). (B) Pseudotime line plot depicting the expression of TIGIT, CD95, CD152, T-bet, CD69 and CD197 markers by cells from the trajectory to End#6. (C) Overlaid histograms showing the expression of chemokine and cytokine receptors CD197, CD363, CD218 and CD196 in the respective populations. Color code: End#6 cells from thymus (orange), Tregs from peripheral blood (blue), immature Tregs from thymus (brown) and naive CD4 SP T cells from peripheral blood (purple).

tviblindi analysis of single-cell RNA data from the study of Park et al. (45) showing the trajectory to Treg population corresponding to End#6

For A and B, the origin (centroid of the DN (early) population) is marked by a purple dot, the candidate endpoints with more than 1% of simulated random walks are indicated by black dots and the endpoints corresponding to the Treg-atlas are highlighted by a blue rectangle. (A) UMAP plot identical to the one shown in Figure 3A of the published research by Park et al. (45), showing the populations using the authors' annotations (legend at the bottom left). (B) vaevictis plot of the same data as in A, with pseudotime represented by a yellow-to-red color gradient. (C) vaevictis plot showing the trajectory leading to Treg-atlas. For D-G, the individual markers are labeled by their gene names, in accordance with the original research, and the CD-designated markers are shown in parentheses. (D) Pseudotime line plot of markers canonical to the Treg population. (E) Pseudotime line plot of markers associated with Treg activation overlapping with our mass cytometry data from End#6 cells shown in Figure 6B. (F) Pseudotime line plot of cytokine and chemokine receptors overlapping with our cytometry data from End#6 cells shown in Figure 6C. (G) Pseudotime line plot showing the expression of additional chemokine markers measured on Treg-atlas.
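
The 1% threshold used for the black dots in panels A and B can be illustrated with a minimal sketch. It assumes that each simulated random walk is summarized by the index of the vertex where it terminates; the function name, the vertex indices and the walk counts below are hypothetical and do not come from the dataset.

```python
from collections import Counter

def frequent_endpoints(walk_ends, min_fraction=0.01):
    """Return endpoints reached by more than `min_fraction` of all walks."""
    counts = Counter(walk_ends)
    total = len(walk_ends)
    return sorted(v for v, c in counts.items() if c / total > min_fraction)

# Hypothetical example: 1000 walks ending in three different vertices.
walk_ends = [7] * 600 + [12] * 395 + [3] * 5
kept = frequent_endpoints(walk_ends)
print(kept)  # vertex 3 receives only 0.5% of walks and is dropped
```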