Single cell multiomics analysis of CML and normal bone marrow (nBM) by CITE-seq

A. CML LSC and progenitor populations from nine CML BM samples at diagnosis were FACS sorted and subjected to CITE-seq. CML heterogeneity was defined by label transfer through cell projection onto a reference UMAP of age-matched nBM from two donors (Sommarin et al., 2021). Diagnostic samples were retrospectively stratified by BCR::ABL International Scale (IS) % after 12 months of TKI treatment, according to the European LeukemiaNet recommendations.

(B) UMAP embedding of 4,696 Lin-CD34+ nBM cells (n = 2) with 11 clusters identified by Leiden clustering; annotated by marker genes and ADT expression (HSC = hematopoietic stem cell; MPP = multipotent progenitor population 1; My/Ly = myeloid, lymphoid progenitors; My = myeloid progenitors; Ly/pDC/mono = lymphoid, dendritic, monocytic progenitors; MPP2 = multipotent progenitor population 2; MEP = megakaryocytic erythroid progenitors); MkP = megakaryocytic erythroid progenitors; ErP = erythroid progenitors; Baso/mc = basophilic, mast cell progenitors; Cycling = Cycling progenitors). The bar plot below shows the percentage of cells per cluster.

(C) Dot plot showing the ADT expression within clusters. The color indicates mean ADT expression (red = high expression, blue = low expression); dot size represents the fraction of cells with expression.

(D) Dot plot showing mRNA expression of the top 10 marker genes per cluster. The color indicates mean RNA expression (red = high expression, blue = low expression); dot size represents the fraction of cells with expression.

Single cell maps of heterogeneity across patients and identification of a pan-stem and progenitor gene signature for CML

(A) UMAP embedding of Lin-CD34+ sorted CML BM cells from nine CML patients at diagnosis (14, 274 cells across patients). Cell color indicate cluster identity after label transfer through cell projection onto an aged-matched nBM reference (the first UMAP from the left).

(B) Bar plots showing the log2 fold change in cluster distribution (%) of Lin-CD34+ CML BM cells (n = 9) compared to Lin-CD34+ from nBM. Error bars depict standard deviation.

(C) Volcano plots showing differentially expressed genes between the Lin-CD34+ CML clusters (across all patients, n = 9) and the corresponding clusters from nBM (adjusted p-value < .01, log2 fold change > 1/< -1). Red and black dots represent genes uniquely up- or down-regulated per cluster comparison (genes not found significantly changed in any other cluster DEG analysis). The top 10 significant, unique genes are labelled in plot. Vertical dotted lines mark a log2 fold change equal to 1 and -1.

(D) Heatmap showing the average expression of the 50 significantly up-regulated and 21 down-regulated CML signature genes (genes significantly changed and consistent through all clusters DEG analyses) across clusters from all CML patients as well as nBM (adjusted p-value < .01, log2 fold change > 1/< -1). Red indicates high expression, blue low expression.

Cell heterogeneity at diagnosis relates to TKI therapy outcome in CML

(A) Stacked bar plots showing the cluster distribution within Lin-CD34+ compartment in all CML patients at diagnosis as well as nBM. CML patients are ordered by BCR::ABL1IS (%) following 12 months of TKI therapy (M12) and retrospectively stratified as per European LeukemiaNet recommendations; Optimal = CML1-4 (≤ 0.1 %), Warning = CML5-7 (> 0.1-1 %), Failure = CML8-9 (> 1 %). The total number of Lin-CD34+ cells from individual patients is indicated at the bottom.

(B) Box plots comparing cluster proportions in optimal responders (n = 4, CML patients 1-4) and treatment failures (n = 2, CML patients 8-9) at diagnosis; the patients were retrospectively stratified according to BCR::ABL1IS (%) after 12 months of TKI therapy (Optimal ≤ 1 %), Failure > 1 %).

(C) A scheme for computational deconvolution of bulk transcriptomes from patients into constituent cell populations; using CITE-seq derived gene signatures from Lin- CD34+ nBM as reference, an independent bulk CD34+ microarray dataset from CML patients (n = 59) was deconvoluted into constituent cell populations.

(D) Stacked bar plots showing percentage of specific clusters within CD34+ cells from individual CML patients (n = 59). The x-axis shows individual GSE ID for patients, y-axis shows percentage of clusters with similar color code as used in Fig. 1-2 UMAPs. Annotation of individual patient as per the original study (McWeeney et al., 2010); non-responders with > 65 % Ph+ metaphases (did not achieve even a minor cytogenetic response), responders with 0 % Ph+ metaphases after 12 months of Imatinib therapy (achieved CCyR). BM and PB represent CD34+ cells isolated from bone marrow, and peripheral blood respectively.

(E) The fold change in proportions for cell clusters between non-responders and responders (annotation as described above) for CD34+ cells isolated from bone marrow; statistical significance shown by asterisk *; student t-test, p-value < .05.

Identification of BCR::ABL1+ and BCR::ABL1- primitive cells and their surface markers by joint analysis of single cell gene expression and multiplexed antibody derived sequence tags (ADTs)

(A) Overview of analysis steps to generate Figure 4B. (1) Lin-CD34+CD38-/low cells were projected onto Lin-CD34+ nBM. Subsequently, the primitive cluster CITE-seq data from all CML patients and nBM (negative control) were merged and visualized together in one UMAP. BCR::ABL1+ gene signatures from BCR::ABL1 targeted SMART-seq (Giustacchini et al., 2017) was used to define primitive cluster cells from the individual patients as either BCR::ABL1+ or BCR::ABL1- (2) Lin-CD34+ and Lin- CD34+CD38-/low CITE-seq data from the same patient was merged and visualized together in a UMAP (3) BCR-ABL status from the Lin-CD34+CD38-/low primitive cells was linked to the joint UMAPs by matching cell IDs and cell were colored as either red (BCR::ABL1+) or black (BCR::ABL1-) (4) The log2 fold change in ADT expression between primitive Lin-CD34+CD38-/low BCR::ABL1+ (red) and BCR::ABL1- (black) cells was calculated and the remaining cells were colored according to their cluster annotation given after projection onto Lin-CD34+ nBM.

(B) UMAP plots showing the merged CITE-seq data of Lin-CD34+ and Lin- CD34+CD38-/low sorted populations for CML2-9 individually. Cells are color-coded according to cluster annotation given following projection onto Lin-CD34+ nBM. Lin- CD34+CD38-/low primitive cluster cells are annotated as BCR::ABL1+ (red) or BCR::ABL1- (black) and display enrichment of BCR::ABL1+ LSC signatures and non-leukemic stem cell signatures respectively. The red and black bars in the bar plot below indicate ADTs with significant changes in expression (p-value < .05, log2 fold change >1 / < -1) for BCR::ABL1+ and BCR::ABL1- cells, respectively.

(E) Volcano plot showing differentially expressed genes between CML Lin- CD34+CD38-/low Primitive BCR::ABL1+ and BCR::ABL1- cells. Red and black dots represent significant up- and down-regulated genes (adjusted p-value < .01, log2 fold change > 1/ < -1). The top 10 significant genes are labelled. Vertical dotted lines mark a fold change equal to 1 and -1.

(F) UMAP embedding of CML Lin-CD34+ cells from one representative CML patient at diagnosis (CML5). First plot from the left show cells colored according to the cluster identity given after mapping to nBM, the primitive cells are colored yellow. The following two UMAPs of Lin-CD34+ show the relative mRNA expression of the BCR::ABL1+ LSC signature and BCR::ABL1- non-leukemic stem cell gene signature; scale: red = high expression, blue = low expression.

Assessment of surface marker combinations to capture molecularly defined BCR::ABL1+ and BCR::ABL1- primitive cells

(A) Heatmap comparing ADT expression between BCR::ABL1+ and BCR::ABL1+ cells within the Lin-CD34+CD38-/low primitive cluster across CML patients (black = surface markers significantly up-regulated in BCR::ABL1- cells (log2 fold change < - 1, p-value < .05); red = surface markers significantly up-regulated in BCR::ABL1+ cells (log2 fold change > 1, p-value < .05); grey = non-significant change in ADT expression; white = surface marker was not present in the ADT panel used for the specific patient).

(B) Visualization of the relative ADT expression of CD25, CD26, CD93, CD117 and CD35 in the UMAPs of Lin-CD34+ cells from CML patient 5 and 9 (red = high expression, blue = low expression).

(C) Assessment of a selection of ADTs; CD25, CD26, CD93, CD117 and CD35 capacity to capture BCR::ABL1+ primitive, BCR::ABL1- primitive, all primitive cells, and specific progenitors across patients by gating on their expression within the Lin- CD34+ compartment (x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample).

(D) Assessment of specific ADTs: CD26, CD25 and CD35 capacity to capture BCR::ABL1+ primitive versus BCR::ABL1- primitive cluster within the Lin- CD34+CD38-/low compartment across patients; x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample.

(E) Assessment of specific combinations of ADTs: CD26+CD35-, CD26-CD35+, CD26- CD35-, and CD26+CD35+ capacity to capture BCR::ABL1+ primitive versus BCR::ABL1- primitive cluster within the Lin-CD34+CD38-/low compartment across patients; x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample.

Analysis of CD26+CD35- vs. CD26-CD35+ cells for BCR::ABL1 transcript expression, division kinetics in vitro, and sustenance through TKI therapy

(A) Real time qPCR for BCR::ABL1 fusion transcript in CD26+CD35- (n = 3, CML10-11 and CML20) vs. CD26-CD35+ (n = 2, CML10-11) sorted cells from Lin- CD34+CD38-/lowCD45RA- compartment using Taqman probe against BCR::ABL1; GAPDH served as the control.

(B) Cumulative plots of time to first division kinetics from single CD26+CD35- or CD26-CD35+ cells within Lin-CD34+CD38-/lowCD45RA- population are shown for each CML patient at diagnosis. Dead cells and non-dividing cells were excluded from the analysis.

(C) Mean time to first division (hours) was calculated for the two patients analyzed (p-value = ns., .84; paired t-test).

(D) Representative FACS plots showing the percentage CD26+CD35- (LSC) and CD26- CD35+ (HSC) cells within the Lin-CD34+CD38-/low compartment. Left panel: ADT gated CD26+CD35- and CD26-CD35+ cells within Lin-CD34+CD38-/low CITE-seq data from selected optimal responder (CML4) and treatment failure (CML8) at diagnosis. Right panel: FACS gated Lin-CD34+CD38-/lowCD26+CD35- and CD26-CD35+ cells in selected optimal responder and treatment failure at diagnosis versus 3 months of TKI therapy (Bosutinib).

(E) Bar plots showing the percentage of CD26+CD35- and CD26-CD35+ cells within Lin-CD34+CD38-/low compartment from optimal responders (n = 6, CML12 and CML14-18) and treatment failures (n = 2, CML13 and CML19) at diagnosis versus 3 months of TKI therapy (Bosutinib) determined using FACS.

(A) Distribution of annotated cell clusters across the two Lin-CD34+ nBM samples.

(B) Cell cycle scoring: G1, S, G2/M of UMAP embedded Lin-CD34+ cells from nBM using Scarf.

(A) UMAP plots showing the mapping score for the individual Lin-CD34+ CML samples at diagnosis when mapped onto the nBM reference; mapping was performed using Scarf (Dhapola et al., 2022). Circle size indicates mapping score and circle color specify cluster identity of the reference cell.

(B) Bar plots showing the log2 fold change in cluster distribution (%) within the CML Lin-CD34+ compartment at diagnosis when compared to nBM for the individual CML patients.

(A) Volcano plots showing differentially expressed genes between Lin-CD34+ CML (n = 9) at diagnosis and nBM cells across clusters (DEseq2). Red and black dots represent genes found to be consistently significant up or down-regulated across all cluster comparisons in CML, respectively (adjusted p-value < .01, log2 fold change = >1 / < - 1). Vertical dotted line marks a log2 fold change equal to 1 and -1.

(B) The fold change in proportions for cell clusters between all non-responders and responders (CD34+ cells isolated from both PB and BM); statistical significance shown by asterisk *; student t-test, p-value < .05.

(C) Assessment of the CML related ADTs; CD25, CD26, CD93 capacity to capture primitive cells, and specific progenitors across patients by gating on their expression within the Lin-CD34+ compartment (x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample).

(A) UMAP plots showing the mapping score for individual Lin-CD34+CD38-/low CML samples (CML2-8) at diagnosis and Lin-CD34+CD38-/low normal BM when mapped onto the Lin-CD34+ nBM reference; mapping was performed using Scarf. Circle size indicates mapping score and circle color specify cluster identity of the reference cell.

(B) UMAP embedding of Lin-CD34+CD38-/low primitive cluster CML (n = 8, a total of 2778 cells) and normal BM cells showing clusters following Leiden clustering analysis.

(C) UMAP embedding of Lin-CD34+CD38-/low primitive cluster CML (n = 8, a total of 2778 cells) at diagnosis and normal BM cells where cells are colored by sample identity.

(D) Same UMAP as in B-C above, showing the relative expression of BCR::ABL1+ LSC (left panel) and BCR::ABL1- LSC signature genes (right panel) used to define BCR::ABL1+ primitive cells (Signature genes are from DEG analysis comparing CP-CML Lin-CD34+CD38-/low BCR::ABL1+ and BCR::ABL1- (Giustacchini et al., 2017).

(E) Same UMAP as in B-C above, showing the relative expression of BCR::ABL1+ LSC (left panel) and nBM HSC signature genes (right panel) used to define BCR::ABL1+ primitive cells (Signature genes are from DEG analysis comparing CP-CML Lin-CD34+CD38-/low BCR::ABL1+ and nBM Lin-CD34+CD38-/low cells (Giustacchini et al., 2017).

(F) Quantification of Lin-CD34+CD38-/low primitive BCR::ABL1+ (red) and BCR::ABL1- (black) cells per CML patient at diagnosis. Total primitive cells per patient is indicated at the bottom.

(A) UMAP plots where CITE-seq data from Lin-CD34+ and Lin-CD34+CD38-/low sorted cells from the same CML patients have been merged, and Lin-CD34+CD38-/low cells are highlighted in blue.

(B) UMAP plots where CITE-seq data from Lin-CD34+ and Lin-CD34+CD38-/low sorted cells from the same CML patients have been merged, where Lin-CD34+CD38-/low primitive annotated cells are highlighted in blue.

(A) UMAP plots of Lin-CD34+ cells from CML patients 1-4 and 6-9, showing cells colored according to cluster identity given after mapping to nBM (left panel); the relative expression of Lin-CD34+CD38-/low Primitive BCR::ABL1+ signature genes (center panel) and the relative expression of Lin-CD34+CD38-/low Primitive BCR::ABL1- signature genes (right panel). DEG analysis is shown in Figure 4C and relative expression of signatures for CML5 is found in Figure 4D. (red = high expression, blue = low expression).

(A) UMAP plots of CML Lin-CD34+ primitive cluster cells from all CML patients (n = 9, a total of 1 106 cells) showing clusters following Leiden clustering analysis.

(B) UMAP plots of CML Lin-CD34+ primitive cluster from all CML patients (n = 9, a total of 1 106 cells) where the cells are colored as per the sample identity

(C) UMAP plots of CML Lin-CD34+ primitive cluster cells from all CML patients showing the relative expression of two DEG signatures used to define BCR::ABL1+ primitive cells: (1) BCR::ABL1+ LSC vs nBM HSC signature genes (top row, left panel = up-regulated genes in BCR::ABL1+ LSCs, right panel = genes up-regulated in nBM HSCs), (2) BCR::ABL1+ LSC vs BCR::ABL1- LSC (bottom row, left panel = genes up-regulated in BCR::ABL1+ LSC, right panel = genes up-regulated in BCR::ABL1- LSC) (Giustacchini et al., 2017).

(D) UMAP plots of Lin-CD34+ cells from individual CML patients (1-9) showing primitive cluster cells (yellow cluster, left panel) annotated as BCR::ABL1+ (red) and BCR::ABL1- (black)(right panel).

(E) Quantification of Lin-CD34+ primitive BCR::ABL1+ (red, cluster 2-3 Supp. Fig. 7A) and BCR::ABL1- (black, cluster 1 Supp. Fig. 7A) cells per CML patient at diagnosis. Total primitive cells are indicated at the bottom.

(A) UMAP plots of Lin-CD34+ cells from CML patients 1-4 and 6-8 showing the relative ADT expression of a selection of markers (CD25, CD93 CD26, CD117, CD35) (red = high expression, blue = low expression). ADT expression for CML patient 5 and 9 is shown in Fig. 5B.

(A) Plots showing representative gates on ADT expression for CD26, CD35, CD117, CD25, CD93 within the Lin-CD34+ CITE-seq data. Gates are shown for CML5 and CML9.

(B) Plots showing the gating on CD26 and CD35 ADT expression within the Lin- CD34+CD38-/low CITE-seq data for CML patients 2-3, 5-7 and 9. Plots for CML patient 4 and 8 is shown in Fig. 6D.