Single cell multiomics analysis of CML and normal bone marrow (nBM) by CITE-seq

(A) CML LSC and progenitor populations from nine CML BM samples at diagnosis were FACS sorted and subjected to CITE-seq. CML heterogeneity was defined by label transfer through cell projection onto a reference UMAP of age-matched nBM from two donors (Sommarin et al., 2021). Diagnostic samples were retrospectively stratified by BCR::ABL1 International Scale (IS) % after 12 months of TKI treatment, according to the European LeukemiaNet recommendations.

(B) UMAP embedding of 4,696 LinCD34+ nBM cells (n = 2) with 11 clusters identified by Leiden clustering; annotated by marker genes and ADT expression (HSC = hematopoietic stem cell; MPP = multipotent progenitor population 1; My/Ly = myeloid, lymphoid progenitors; My = myeloid progenitors; Ly/pDC/mono = lymphoid, dendritic, monocytic progenitors; MPP2 = multipotent progenitor population 2; MEP = megakaryocytic erythroid progenitors); MkP = megakaryocytic erythroid progenitors; ErP = erythroid progenitors; Baso/mc = basophilic, mast cell progenitors; Cycling = Cycling progenitors). The bar plot below shows the percentage of cells per cluster.

(C) Dot plot showing the ADT expression within clusters. The color indicates mean ADT expression (red = high expression, blue = low expression); dot size represents the fraction of cells with expression.

(D) Dot plot showing mRNA expression of the top 10 marker genes per cluster. The color indicates mean RNA expression (red = high expression, blue = low expression); dot size represents the fraction of cells with expression.

Single cell maps of heterogeneity across patients and identification of a pan-stem and progenitor gene signature for CML

(A) UMAP embedding of LinCD34+ sorted CML BM cells from nine CML patients at diagnosis (14, 274 cells across patients). Cell color indicates cluster identity after label transfer through cell projection onto an aged-matched nBM reference (the first UMAP from the left).

(B) Bar plots showing the log2 fold change in cluster distribution (%) of LinCD34+ CML BM cells (n = 9) compared to LinCD34+ from nBM. Error bars depict standard deviation.

(C) Volcano plots showing differentially expressed genes between the LinCD34+ CML clusters (across all patients, n = 9) and the corresponding clusters from nBM (adjusted p-value < .01, log2 fold change > 1/< −1). Red and black dots represent genes uniquely up- or down-regulated per cluster comparison (genes not found significantly changed in any other cluster DEG analysis). The top 10 significant, unique genes are labelled in plot. Vertical dotted lines mark a log2 fold change equal to 1 and −1.

(D) Heatmap showing the average expression of the 50 significantly up-regulated and 21 down-regulated CML signature genes (genes significantly changed and consistent through all clusters DEG analyses) across clusters from all CML patients as well as nBM (adjusted p-value < .01, log2 fold change > 1/< −1). Red indicates high expression, blue low expression.

Cell heterogeneity at diagnosis relates to TKI therapy outcome in CML

(A) Stacked bar plots showing the cluster distribution within LinCD34+ compartment in all CML patients at diagnosis as well as nBM. CML patients are ordered by BCR::ABL1IS (%) following 12 months of TKI therapy (M12) and retrospectively stratified as per European LeukemiaNet recommendations; Optimal = CML1-4 (≤ 0.1 %), Warning = CML5-7 (> 0.1-1 %), Failure = CML8-9 (> 1 %). The total number of LinCD34+ cells from individual patients is indicated at the bottom.

(B) Box plots comparing cluster proportions in optimal responders (n = 4, CML patients 1-4) and treatment failures (n = 2, CML patients 8-9) at diagnosis; the patients were retrospectively stratified according to BCR::ABL1IS (%) after 12 months of TKI therapy (Optimal ≤ 1 %), Failure > 1 %).

(C) A scheme for computational deconvolution of bulk transcriptomes from patients into constituent cell populations; using CITE-seq derived gene signatures from Lin CD34+ nBM as reference, an independent bulk CD34+ microarray dataset from CML patients (n = 59) (McWeeney et al., 2010) was deconvoluted into constituent cell populations using CIBERSORTx (Newman et al., 2019)

(D) Stacked bar plots showing percentage of specific clusters within CD34+ cells from individual CML patients (n = 59). The x-axis shows individual GSE ID for patients, y-axis shows percentage of clusters with similar color code as used in Fig. 1-2 UMAPs. Annotation of individual patient as per the original study (McWeeney et al., 2010); non-responders with > 65 % Ph+ metaphases (did not achieve even a minor cytogenetic response), responders with 0 % Ph+ metaphases after 12 months of Imatinib therapy (achieved CCyR). BM and PB represent CD34+ cells isolated from bone marrow, and peripheral blood respectively.

(E) The fold change in proportions for cell clusters between non-responders and responders (annotation as described above) for CD34+ cells isolated from bone marrow; statistical significance shown by asterisk *; student t-test, p-value < .05.

Identification of BCR::ABL1+ and BCR::ABL1 primitive cells and their surface markers by joint analysis of single cell gene expression and multiplexed antibody derived sequence tags (ADTs)

(A) Overview of analysis steps to generate Figure 4B. (1) LinCD34+CD38−/low cells were projected onto LinCD34+ nBM. Subsequently, the primitive cluster CITE-seq data from all CML patients and nBM (negative control) were merged and visualized together in one UMAP. BCR::ABL1+ gene signatures from BCR::ABL1 targeted SMART-seq (Giustacchini et al., 2017) was used to define primitive cluster cells from the individual patients as either BCR::ABL1+ or BCR::ABL1 (2) LinCD34+ and Lin CD34+CD38−/low CITE-seq data from the same patient was merged and visualized together in a UMAP (3) BCR-ABL status from the LinCD34+CD38−/low primitive cells was linked to the joint UMAPs by matching cell IDs and cell were colored as either red (BCR::ABL1+) or black (BCR::ABL1) (4) The log2 fold change in ADT expression between primitive LinCD34+CD38−/low BCR::ABL1+ (red) and BCR::ABL1 (black) cells was calculated and the remaining cells were colored according to their cluster annotation given after projection onto LinCD34+ nBM.

(B) UMAP plots showing the merged CITE-seq data of LinCD34+ and Lin CD34+CD38−/low sorted populations for CML2-9 individually. Cells are color-coded according to cluster annotation given following projection onto LinCD34+ nBM. Lin CD34+CD38−/low primitive cluster cells are annotated as BCR::ABL1+ (red) or BCR::ABL1 (black) and display enrichment of BCR::ABL1+ LSC signatures and non-leukemic stem cell signatures respectively. The red and black bars in the bar plot below indicate ADTs with significant changes in expression (p-value < .05, log2 fold change >1 / < −1) for BCR::ABL1+ and BCR::ABL1 cells, respectively.

(C) Volcano plot showing differentially expressed genes between CML Lin CD34+CD38−/low Primitive BCR::ABL1+ and BCR::ABL1 cells. Red and black dots represent significant up- and down-regulated genes (adjusted p-value < .01, log2 fold change > 1/ < −1). The top 10 significant genes are labelled. Vertical dotted lines mark a fold change equal to 1 and −1.

(D) UMAP embedding of CML LinCD34+ cells from one representative CML patient at diagnosis (CML5). First plot from the left show cells colored according to the cluster identity given after mapping to nBM, the primitive cells are colored yellow. The following two UMAPs of LinCD34+ show the relative mRNA expression of the BCR::ABL1+ LSC signature and BCR::ABL1 non-leukemic stem cell gene signature; scale: red = high expression, blue = low expression.

Assessment of surface marker combinations to capture molecularly defined BCR::ABL1+ and BCR::ABL1 primitive cells

(A) Heatmap comparing ADT expression between BCR::ABL1+ and BCR::ABL1 cells within the LinCD34+CD38−/low primitive cluster across CML patients (black = surface markers significantly up-regulated in BCR::ABL1 cells (log2 fold change < −1, p-value < .05); red = surface markers significantly up-regulated in BCR::ABL1+ cells (log2 fold change > 1, p-value < .05); grey = non-significant change in ADT expression; white = surface marker was not present in the ADT panel used for the specific patient).

(B) Visualization of the relative ADT expression of CD25, CD26, CD93, CD117 and CD35 in the UMAPs of LinCD34+ cells from CML patient 5 and 9 (red = high expression, blue = low expression).

(C) Assessment of a selection of ADTs; CD25, CD26, CD93, CD117 and CD35 capacity to capture BCR::ABL1+ primitive, BCR::ABL1 primitive, all primitive cells, and specific progenitors across patients by gating on their expression within the Lin CD34+ compartment (x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample).

(D) Assessment of specific ADTs: CD26, CD25 and CD35 capacity to capture BCR::ABL1+ primitive versus BCR::ABL1 primitive cluster within the Lin CD34+CD38−/low compartment across patients; x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample.

(E) Assessment of specific combinations of ADTs: CD26+CD35, CD26CD35+, CD26 CD35, and CD26+CD35+ capacity to capture BCR::ABL1+ primitive versus BCR::ABL1 primitive cluster within the LinCD34+CD38−/low compartment across patients; x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample.

Analysis of CD26+CD35 vs. CD26CD35+ cells for BCR::ABL1 transcript expression, and response to TKI therapy

(A) Real time qPCR for BCR::ABL1 in CD26CD35+ (n = 7, CML patients 10-12, 14-15, 17-18) vs CD26+CD35 (n = 9, CML patients 10-15, 17-18 and 20) within the Lin CD34+CD38−/low compartment at diagnosis and following 3 months of Bosutinib therapy (n = 3, CML patients 17, 18, 21). GAPDH served as control.

(B) Real time qPCR for GAS2 in CD26CD35+ (n = 3, CML patients 14, 17-18) vs CD26+CD35 (n = 4, CML patients 13-14, 17-18) within the LinCD34+CD38−/low compartment at diagnosis and following 3 months of Bosutinib therapy (n = 3, CML patients 17-18, 21). GAPDH served as control.

(C) Representative FACS plots showing the percentage CD26+CD35 (LSC) and CD26 CD35+ (HSC) cells within the LinCD34+CD38−/low compartment. Left panel: ADT gated CD26+CD35 and CD26CD35+ cells within LinCD34+CD38−/low CITE-seq data from selected optimal responder (CML4) and treatment failure (CML8) at diagnosis. Right panel: FACS gated LinCD34+CD38−/lowCD26+CD35 and CD26CD35+ cells in selected optimal responder (CML12) and treatment failure (CML13) at diagnosis versus 3 months of TKI therapy (Bosutinib).

(D) Bar plots showing the percentage of CD26+CD35 and CD26CD35+ cells within LinCD34+CD38−/low compartment from optimal responders and treatment failures at diagnosis (n = 11; optimal = 6 (≤ 0.1 %, CML patients 12, 14-18), failure = 5 (> 1 %, CML patients 13, 19, 22-23, 25) and following 3 months of Bosutinib therapy (n = 11; optimal = 6 (≤ 0.1 %, CML patients 12, 14-18), failure = 5 (> 1 %, CML patients 13, 19, 22-23, 24) determined using FACS.

(A) Distribution of annotated cell clusters across the two Lin-CD34+ nBM samples.

(B) Cell cycle scoring: G1, S, G2/M of UMAP embedded Lin-CD34+ cells from nBM using Scarf.

(C) Visualization of the relative mRNA (left panels) and ADT expression (right panels) of a selection of genes and surface markers within the Lin-CD34+ nBM reference.

(A) UMAP plots showing the mapping score for the individual LinCD34+ CML samples at diagnosis when mapped onto the nBM reference; mapping was performed using Scarf (Dhapola et al., 2022). Circle size indicates mapping score and circle color specify cluster identity of the reference cell.

(B) Bar plots showing the log2 fold change in cluster distribution (%) within the CML LinCD34+ compartment at diagnosis when compared to nBM for the individual CML patients.

(A) Volcano plots showing differentially expressed genes between LinCD34+ CML (n = 9) at diagnosis and nBM cells across clusters (DESeq2). Red and black dots represent genes found to be consistently significant up- or down-regulated across all cluster comparisons in CML, respectively (adjusted p-value < .01, log2 fold change = >1 / < −1). Vertical dotted line marks a log2 fold change equal to 1 and −1.

(B) The fold change in proportions for cell clusters between all non-responders and responders (CD34+ cells isolated from both PB and BM); statistical significance shown by asterisk *; student t-test, p-value < .05.

(C) Assessment of the CML related ADTs; CD25, CD26, CD93 capacity to capture primitive cells, and specific progenitors across patients by gating on their expression within the LinCD34+ compartment (x-axis shows specific clusters, y-axis: percentage of specific cluster captured; each white circle represents a patient sample).

(A) UMAP plots showing the mapping score for individual LinCD34+CD38−/low CML samples (CML2-8) at diagnosis and LinCD34+CD38−/low normal BM when mapped onto the LinCD34+ nBM reference; mapping was performed using Scarf. Circle size indicates mapping score and circle color specify cluster identity of the reference cell.

(B) UMAP embedding of LinCD34+CD38−/low primitive cluster CML (n = 8, a total of 2778 cells) and normal BM cells showing clusters following Leiden clustering analysis.

(C) UMAP embedding of LinCD34+CD38−/low primitive cluster CML (n = 8, a total of 2778 cells) at diagnosis and normal BM cells where cells are colored by sample identity.

(D) Same UMAP as in B-C above, showing the relative expression of BCR::ABL1+ LSC (left panel) and BCR::ABL1 LSC signature genes (right panel) used to define BCR::ABL1+ primitive cells (Signature genes are from DEG analysis comparing CP-CML LinCD34+CD38−/low BCR::ABL1+ and BCR::ABL1 (Giustacchini et al., 2017)).

(E) Same UMAP as in B-C above, showing the relative expression of BCR::ABL1+ LSC (left panel) and nBM HSC signature genes (right panel) used to define BCR::ABL1+ primitive cells (Signature genes are from DEG analysis comparing CP-CML LinCD34+CD38−/low BCR::ABL1+ and nBM LinCD34+CD38−/low cells (Giustacchini et al., 2017)).

(F) Quantification of LinCD34+CD38−/low primitive BCR::ABL1+ (red) and BCR::ABL1 (black) cells per CML patient at diagnosis. Total primitive cells per patient is indicated at the bottom.

(A) UMAP plots where CITE-seq data from LinCD34+ and LinCD34+CD38−/low sorted cells from the same CML patients have been merged, and LinCD34+CD38−/low cells are highlighted in blue.

(B) UMAP plots where CITE-seq data from LinCD34+ and LinCD34+CD38−/low sorted cells from the same CML patients have been merged, where LinCD34+CD38−/low primitive annotated cells are highlighted in blue.

(A) UMAP plots of LinCD34+ cells from CML patients 1-4 and 6-9, showing cells colored according to cluster identity given after mapping to nBM (left panel); the relative expression of LinCD34+CD38−/low Primitive BCR::ABL1+ signature genes (center panel) and the relative expression of LinCD34+CD38−/low Primitive BCR::ABL1 signature genes (right panel). DEG analysis is shown in Figure 4C and relative expression of signatures for CML5 is found in Figure 4D. (red = high expression, blue = low expression).

(A) UMAP plots of CML LinCD34+ primitive cluster cells from all CML patients (n = 9, a total of 1 106 cells) showing clusters following Leiden clustering analysis.

(B) UMAP plots of CML LinCD34+ primitive cluster from all CML patients (n = 9, a total of 1 106 cells) where the cells are colored as per the sample identity

(C) UMAP plots of CML LinCD34+ primitive cluster cells from all CML patients showing the relative expression of two DEG signatures used to define BCR::ABL1+ primitive cells: (1) BCR::ABL1+ LSC vs nBM HSC signature genes (top row, left panel = up-regulated genes in BCR::ABL1+ LSCs, right panel = genes up-regulated in nBM HSCs), (2) BCR::ABL1+ LSC vs BCR::ABL1 LSC (bottom row, left panel = genes up-regulated in BCR::ABL1+ LSC, right panel = genes up-regulated in BCR::ABL1 LSC) (Giustacchini et al., 2017).

(D) UMAP plots of LinCD34+ cells from individual CML patients (1-9) showing primitive cluster cells (yellow cluster, left panel) annotated as BCR::ABL1+ (red) and BCR::ABL1 (black)(right panel).

(E) Quantification of LinCD34+ primitive BCR::ABL1+ (red, cluster 2-3 Supp. Fig. 7A) and BCR::ABL1 (black, cluster 1 Supp. Fig. 7A) cells per CML patient at diagnosis. Total primitive cells are indicated at the bottom.

(A) UMAP plots of LinCD34+ cells from CML patients 1-4 and 6-8 showing the relative ADT expression of a selection of markers (CD25, CD26, CD93, CD117, CD35) (red = high expression, blue = low expression). ADT expression for CML patient 5 and 9 is shown in Fig. 5B.

(A) Plots showing representative gates on ADT expression for CD26, CD35, CD117, CD25, CD93 within the LinCD34+ CITE-seq data using FlowJo. Gates are shown for CML5 and CML9.

(B) Plots showing the gating on CD26 and CD35 ADT expression within the Lin CD34+CD38−/low CITE-seq data for CML patients 2-3, 5-7 and 9 using FlowJo. Plots for CML patient 4 and 8 is shown in Fig. 6D.

(C) Cumulative plots of time to first division kinetics from LinCD34+CD38−/low CD45RACD26+CD35 and CD26CD35+ single cells for two individual CML patients at diagnosis (n = 2, CML patients 10 and 11). Dead cells and non-dividing cells were excluded from the analysis.

(D) Mean time to first division (hours) for LinCD34+CD38−/lowCD45RA CD26+CD35 or CD26CD35+ single cells from the two CML patients shown in C (n = 2, CML patients 10 and 11) (p-value = ns., .84; paired t-test).

(E) Mean cell cycle status of ADT gated CD26+CD35 or CD26CD35+ cells, after running Scarf’s cell cycle scoring on the LinCD34+CD38−/low CITE-seq data for patients individually (n = 8, CML2-9).

(F) Mean cell cycle status of ADT gated CD26+CD35 or CD26CD35+ cells, after running Scarf’s cell cycle scoring on LinCD34+CD38−/low Primitive cluster for patients individually (n = 8, CML2-9).