Single-cell atlas of AML reveals age-related gene regulatory networks in t(8;21) AML

  1. Jessica Whittle
  2. Stefan Meyer
  3. Georges Lacaud (corresponding author)
  4. Syed Murtuza-Baker (corresponding author)
  5. Mudassar Iqbal (corresponding author)
  1. Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
  2. Stem Cell Biology Group, Cancer Research UK Manchester Institute, The University of Manchester, United Kingdom
  3. Manchester Cancer Research Centre (MCRC), Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, United Kingdom
  4. Department of Paediatric and Adolescent Oncology, Royal Manchester Children’s Hospital, United Kingdom
  5. Department of Adolescent Oncology, The Christie NHS Foundation Trust, United Kingdom
5 figures and 3 additional files

Figures

Figure 1 with 2 supplements
Large-scale data integration creates a single-cell atlas of acute myeloid leukemia (AML).

(A) Overview of the analysis steps in creating AML scAtlas. (B) Proportion of cells (left panel) and samples (right panel) belonging to each AML subtype as defined by the European LeukemiaNet (ELN) clinical guideline. (C) Age group and gender distribution of AML single-cell Atlas (scAtlas) cohort samples. (D) scVI harmonized UMAP colored by annotated cell types. (E) Dot plot of the expression of key hematopoietic marker genes across annotated cell types. Color scale shows mean gene expression; dot size represents the fraction of cells expressing the given gene.

Figure 1—figure supplement 1
Initial analysis establishes presence of batch effects.

(A) Initial dimensionality reduction and UMAP plotting prior to batch correction of the 748,679 high-quality cells. (B) Visualization of key hematopoietic marker genes on the uncorrected UMAP. (C) Representative study examples (Petti et al., 2019; Zhang et al., 2023) investigating batch effects. UMAPs of different samples in each study (top panels), and hematopoietic marker genes (bottom panels; Petti et al., 2019, top; Zhang et al., 2023, bottom).

Figure 1—figure supplement 2
Benchmarking batch correction methods.

(A) Dimensionality reduction and UMAP visualization using the batch-corrected embeddings for scVI (left), Harmony (middle), and scANVI (right) shows improved integration of different studies in all cases. (B) Visualization of hematopoietic marker genes on the UMAP for Harmony (top) and scANVI (bottom) shows improved harmonization of cell types following batch correction. (C) Projection of original publication cell type annotations, where available, onto UMAP plots for scVI (left), Harmony (middle), and scANVI (right).

Figure 2 with 1 supplement
Characterizing cell type distributions in acute myeloid leukemia (AML) subtypes.

(A) UMAP highlighting the distribution of cells from different AML subtypes in AML single-cell Atlas (scAtlas). (B) Schematic showing the workflow used to identify leukemic stem cells (LSCs) from the AML scAtlas hematopoietic stem and progenitor cell (HSPC) clusters. (C) Using the AML scAtlas HSPC clusters only, UMAP was regenerated and annotated with an AML-specific reference of leukemia stem and progenitor cells (LSPCs). (D) UMAPs showing the leukemic stem cell scores of each cell, for the LSC17 (left) and LSC6 (right). (E) Proportions of HSPC/LSPC populations in different AML subtypes (left) and AML risk groups (right), as defined by European LeukemiaNet (ELN) clinical guidelines. (F) Comparison of LSC abundance in favourable and adverse ELN risk groups. Chi-square test statistic: 8658.98, degrees of freedom: 1, p-value: 0.0 (that is, below double-precision floating-point resolution).
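
A p-value of exactly 0.0, as given for panel F, is what double-precision arithmetic returns when the true tail probability is too small to represent. The following is a minimal sketch, not the authors' code, using only the Python standard library. It relies on the identity that a chi-square variable with 1 degree of freedom has upper tail erfc(sqrt(x/2)), and on the leading term of the large-argument expansion of erfc to estimate the order of magnitude. The function names are hypothetical.

```python
import math


def chi2_sf_df1(stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom."""
    # For df = 1, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


def chi2_log10_sf_df1(stat: float) -> float:
    """Approximate log10 of the p-value for a large statistic (df = 1).

    Uses the leading asymptotic term erfc(z) ~ exp(-z**2) / (z * sqrt(pi)),
    which is only accurate when z is large.
    """
    z = math.sqrt(stat / 2.0)
    return (-z * z - math.log(z * math.sqrt(math.pi))) / math.log(10.0)


stat = 8658.98  # test statistic quoted in the panel F caption
p_value = chi2_sf_df1(stat)        # underflows in double precision
log10_p = chi2_log10_sf_df1(stat)  # order-of-magnitude estimate instead
```

Reporting the logarithm of the p-value, or a bound such as "p < 1e-300", avoids implying that the probability is literally zero.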

Figure 2—figure supplement 1
Cell type proportions vary by acute myeloid leukemia (AML) subtype.

(A) Comparison of cell type abundance across different AML subtypes, shown as absolute values. (B) Comparison of cell type abundance across different AML subtypes, shown as cell type proportions.

Figure 3 with 1 supplement
Acute myeloid leukemia (AML) single-cell Atlas (scAtlas) reveals age-associated heterogeneity in t(8;21) AML.

(A) Depiction of the workflow to generate and validate the t(8;21) AML gene regulatory network (GRN) from AML scAtlas. (B) Using the AML scAtlas t(8;21) sample cells, UMAP was re-computed and shows the different cell types. (C) Bar plots of the absolute cell type numbers (left panel) and the cell type proportions (right panel) stratified by age group. The CD34 enrichment performed on several adult samples is reflected. (D) Using HSPCs and CMPs only, the pySCENIC gene regulatory network (GRN) and regulon AUC scores were calculated. Z-score normalized scores underwent hierarchical clustering to create a clustered heatmap and identify age-associated regulons. Regulons were prioritized using their regulon specificity scores (RSS).
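
The Z-score normalization mentioned in panel D can be sketched as follows. This is an illustrative standard-library example, not the authors' pipeline: the input layout (one list of per-cell AUC scores per regulon), the use of the population standard deviation, the handling of constant rows, and all names and toy values are assumptions.

```python
from statistics import mean, pstdev


def zscore_rows(auc):
    """Z-score each regulon's AUC values across cells (one row per regulon)."""
    out = {}
    for regulon, values in auc.items():
        mu = mean(values)
        sd = pstdev(values)  # population SD; an assumption of this sketch
        # A constant row has no spread; map it to zeros rather than divide by 0.
        out[regulon] = [0.0 if sd == 0 else (v - mu) / sd for v in values]
    return out


# Hypothetical AUC scores for three regulons across four cells.
toy_auc = {
    "regulon_A": [0.10, 0.20, 0.30, 0.40],
    "regulon_B": [0.05, 0.05, 0.05, 0.05],
    "regulon_C": [0.90, 0.10, 0.10, 0.10],
}
z = zscore_rows(toy_auc)
```

After this step every regulon is on a common scale, so the hierarchical clustering of the heatmap is not dominated by regulons with large raw AUC ranges.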

Figure 3—figure supplement 1
Acute myeloid leukemia (AML) with t(8;21) pySCENIC analysis.

(A) XIST/ChrY expression in samples from patients with no recorded gender. (B) Regulon specificity scores (RSS) for pySCENIC regulons in each age group of the t(8;21) AML data analyzed. (C) Clustered heatmap of the AUC values (Z-score normalized) calculated using pySCENIC. After selecting for HSPCs, regulons were chosen based on their regulon specificity score (RSS). (D) Percentage overlap of regulon transcription factors (TFs) between individual studies with t(8;21) AML samples, when performing pySCENIC on each individually. (E) Overlap between the combined regulon TFs from individual study-wise iterations of pySCENIC and the integrated AML scAtlas dataset.

Figure 4 with 1 supplement
Validation of age-associated regulons in large bulk RNA-seq cohorts.

(A) Using previously defined age-associated regulons, pySCENIC AUC scores (Z-score normalized) were clustered to identify bulk RNA-seq samples (n=83) most enriched for inferred-prenatal and inferred-postnatal origin signatures. (B) Volcano plot of differentially expressed genes when comparing the inferred-prenatal origin (n=31) and inferred-postnatal origin (n=27) samples. Adjusted p-value threshold 0.01; log2 fold change threshold 0.5. Regulon signature associated transcription factors (TFs) are indicated. (C) Enrichment plot of significant gene sets enriched in the inferred-prenatal origin samples. GSEA was performed on the DEGs using MSigDB databases. FDR q-value threshold <0.05. (D) Enrichment plot of drug sensitivity gene sets enriched in the inferred-prenatal samples. GSEA was performed on the DEGs, using drug response signatures from published studies of four widely used acute myeloid leukemia (AML) drugs. FDR q-value threshold <0.05. (E) The predicted cell type proportions, estimated using AutoGeneS deconvolution, of the inferred-prenatal (n=31) and inferred-postnatal origin samples (n=27) were compared using t-tests. Significant p-values <0.05 (*), <0.01 (**), <0.001 (***), and <0.0001 (****) are indicated.
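
The thresholds quoted for panels B and E can be written down as two small helpers. This is a hedged sketch rather than the authors' code: it assumes the log2 fold change cut-off is applied symmetrically to the absolute value, that the comparisons are strict, and that "up" means higher in the first group. The function names and toy rows are hypothetical.

```python
def classify_gene(log2_fc, padj, lfc_cut=0.5, padj_cut=0.01):
    """Label one gene for a volcano plot using the caption's thresholds."""
    if padj < padj_cut and log2_fc > lfc_cut:
        return "up"
    if padj < padj_cut and log2_fc < -lfc_cut:
        return "down"
    return "ns"  # not significant under these cut-offs


def significance_stars(p):
    """Map a p-value to the asterisk notation listed for panel E."""
    for cutoff, stars in ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*")):
        if p < cutoff:
            return stars
    return ""


# Hypothetical (gene, log2 fold change, adjusted p-value) rows.
toy_results = [
    ("GENE_A", 1.2, 0.0005),
    ("GENE_B", -0.8, 0.002),
    ("GENE_C", 0.3, 0.0001),  # significant p-value but small effect
    ("GENE_D", 2.0, 0.2),     # large effect but not significant
]
labels = {gene: classify_gene(lfc, padj) for gene, lfc, padj in toy_results}
```

A gene is highlighted only when it passes both cut-offs, which is why GENE_C and GENE_D in the toy data are labeled "ns".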

Figure 4—figure supplement 1
Validation of pySCENIC regulons.

(A) Differential gene expression volcano plot comparing the inferred-prenatal and inferred-postnatal bulk RNA-sequencing samples, as performed by DESeq2. (B) Venn diagram highlighting the intersection between regulon transcription factors (TFs) and two independent methods of differential gene expression. (C) Heatmap of regulon-associated TFs and their log-normalized gene expression values across the samples in each group (prenatal origin versus postnatal origin). (D) Using the t(8;21) acute myeloid leukemia (AML) data from AML scAtlas, median absolute deviation (MAD) thresholding was used to select cells enriched for the inferred-prenatal origin (top) and inferred-postnatal origin (bottom) signatures. (E) Using the AML single-cell Atlas (scAtlas) cell type annotations from the hematopoietic stem/progenitor cell (HSPC)/leukemic stem and progenitor cell (LSPC) reference dataset, cell type proportions in the inferred-prenatal origin signature cells were compared to the postnatal origin cells.
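
The median absolute deviation (MAD) thresholding named in panel D can be illustrated with a short standard-library sketch. The caption does not state the multiplier or the direction of the cut-off, so the choice of `k = 3`, the selection of cells above `median + k * MAD`, the unscaled MAD, and the toy scores below are all assumptions made for illustration.

```python
from statistics import median


def mad_threshold(scores, k=3.0):
    """Return median + k * MAD for a list of per-cell signature scores."""
    med = median(scores)
    mad = median([abs(s - med) for s in scores])  # unscaled MAD
    return med + k * mad


def select_enriched(scores, k=3.0):
    """Indices of cells whose score exceeds the MAD-based threshold."""
    cut = mad_threshold(scores, k)
    return [i for i, s in enumerate(scores) if s > cut]


# Hypothetical signature scores: five background cells and two enriched cells.
toy_scores = [0.10, 0.12, 0.11, 0.09, 0.13, 0.60, 0.55]
threshold = mad_threshold(toy_scores)
enriched = select_enriched(toy_scores)
```

Because both the median and the MAD are robust to outliers, the two high-scoring cells do not inflate the threshold the way they would with a mean and standard deviation.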

Figure 5 with 2 supplements
Combining multiomics data interrogates age-associated regulons.

(A) SCENIC+ eRegulon dot plot showing correlation between single-cell RNA sequencing (scRNA-seq) target gene activity (indicated by the color scale) and scATAC-seq target region accessibility (depicted by spot size). Regulon specificity score (RSS) identified the key activating eRegulons (+/+) between inferred-prenatal and inferred-postnatal origin disease and allows comparison of diagnosis (Dx) and relapse (Rel) time points. (B) Network showing the inferred-prenatal (blue) and inferred-postnatal (orange) associated eRegulons. Node size represents the number of target genes in each regulon. Edges represent interactions between nodes. (C) Over-representation analysis of age-associated eRegulon target genes using Gene Ontology (GO) Biological Processes curated gene sets. Adjusted p-value threshold 0.05. (D) Principal components analysis (PCA) of the gene-based eRegulon enrichment scores for the inferred-prenatal origin disease at diagnosis and relapse. The PC1 axis explains variance occurring between diagnosis and relapse, where this patient underwent a lineage switch. PC2 captures variance related to hematopoietic differentiation. (E) SCENIC+ perturbation simulation shows the predicted effect of knockout of selected transcription factors (TFs) on the previously computed PCA embedding. Arrows indicate the predicted shift in cell states relative to the initial PCA embedding.

Figure 5—source data 1

SCENIC+ eRegulons for Lambo et al., 2023 t(8;21) acute myeloid leukemia (AML) samples.

https://cdn.elifesciences.org/articles/104978/elife-104978-fig5-data1-v1.csv

Figure 5—figure supplement 1
Multiomics data of t(8;21) acute myeloid leukemia (AML) refines gene regulatory network (GRN).

(A) Using the Lambo et al., 2023 dataset, age-associated regulon activity was calculated using pySCENIC AUCell. This identified patient samples highly enriched for the inferred-prenatal and inferred-postnatal origin signatures. (B) Plot showing the correlation scores between the single-cell RNA sequencing (scRNA-seq) and scATAC-seq-derived eRegulons. Correlation thresholds were used to prioritize eRegulons that correlate across modalities (highlighted in blue). (C) Correlation plot of eRegulon target genes shows clusters of related eRegulons with common targets. (D) SCENIC+ analysis identified a range of patient and cell-type-specific eRegulons. Dot plot shows all direct eRegulons inferred by SCENIC+ after initial filtering steps, split into patient-associated cell type populations. This shows many eRegulons with a high correlation between scRNA-seq target gene activity (indicated by the color scale) and scATAC-seq target region accessibility (depicted by spot size).

Figure 5—figure supplement 2
SCENIC+ analysis identifies age-associated candidate perturbations in t(8;21) acute myeloid leukemia (AML).

(A) Gene Ontology over-representation analysis of regulon target gene clusters, defined from the co-binding correlation map as inferred-prenatal or inferred-postnatal. Gene Ontology (GO) molecular function gene sets were used with an adjusted p-value threshold of 0.05. (B) SCENIC+ perturbation simulation infers the predicted effect of knockout of selected transcription factors (TFs) on the previously computed PCA embedding for the inferred-prenatal origin sample (AML16). Heatmap shows the predicted effect on PC1 (top) and PC2 (bottom) for the TFs with the largest predicted effect across all cell types. (C) SCENIC+ perturbation modelling results for the prenatal origin sample. TFs were prioritized based on a predicted shift in the HSC compartment at diagnosis below –0.3. (D) Principal components analysis (PCA) of the gene-based eRegulon enrichment scores for the inferred-postnatal origin samples at diagnosis and relapse. PC1 explains variance occurring between diagnosis and relapse. PC2 captures variance related to hematopoietic differentiation, split into myeloid and lymphoid trajectories. (E) SCENIC+ perturbation simulation results for the inferred-postnatal origin sample (AML12). Heatmap shows the predicted effect on PC1 (top) and PC2 (bottom) for the TFs with the largest predicted effect across all cell types. (F) SCENIC+ perturbation simulation shows the predicted effect of knockout of selected TFs on the previously computed PCA embedding. Arrows indicate the predicted shift in cell states relative to the initial PCA embedding. (G) DepMap CRISPR dependency scores for BCLAF1 (top) and EP300 (bottom) as potential therapeutic targets identified in the prenatal t(8;21) AML sample, with relevant t(8;21) AML cell lines indicated. (H) DepMap CRISPR dependency scores for BCLAF1 (left) and EP300 (right), ranked for all cell lines (n=1178).

Cite this article

  1. Jessica Whittle
  2. Stefan Meyer
  3. Georges Lacaud
  4. Syed Murtuza-Baker
  5. Mudassar Iqbal
(2026)
Single-cell atlas of AML reveals age-related gene regulatory networks in t(8;21) AML
eLife 14:RP104978.
https://doi.org/10.7554/eLife.104978.3