Single-Cell Atlas of AML Reveals Age-Related Gene Regulatory Networks in t(8;21) AML

Jessica Whittle; Stefan Meyer; Georges Lacaud; Syed Murtuza Baker; Mudassar Iqbal

doi:10.7554/eLife.104978.2

eLife Assessment

This manuscript provides a single-cell transcriptomic atlas for AML (222 samples comprising 748,679 cells) integrating data from multiple studies. They use this dataset to investigate t(8;21) AML, and they reconstruct the Gene Regulatory Network and enhancer Gene Regulatory Network, which allowed identification of interesting targets. This aggregation is important and can help infer differences in genetic regulatory modules based on the age of disease onset. Their compelling effort may help explain age-related variations in prognosis and disease development in subtype-specific manner.

https://doi.org/10.7554/eLife.104978.2.sa2

Significance of findings

important: Findings that have theoretical or practical implications beyond a single subfield

Strength of evidence

compelling: Evidence that features methods, data and analyses more rigorous than the current state-of-the-art

During the peer-review process the editor and reviewers write an eLife assessment that summarises the significance of the findings reported in the article (on a scale ranging from landmark to useful) and the strength of the evidence (on a scale ranging from exceptional to inadequate).

Abstract

Background

Acute myeloid leukemia (AML) is characterized by cellular and genetic heterogeneity, which correlates with clinical course. Although single-cell RNA sequencing (scRNA-seq) reflects this diversity to some extent, the low sample numbers in individual studies limit the analytic potential when comparing specific patient groups.

Results

We performed large scale integration of published scRNA-seq datasets to create a unique single-cell transcriptomic atlas for AML (AML scAtlas), totaling 748,679 cells, from 159 AML patients and 44 healthy donors from 20 different studies. This is the largest single-cell data resource for AML to our knowledge, publicly available at https://cellxgene.bmh.manchester.ac.uk/AML/. This AML scAtlas allowed investigations into 20 patients with t(8;21) AML, where we explored the clinical importance of age, given the in-utero origin of pediatric disease. We uncovered age-associated gene regulatory network (GRN) signatures, which we validated using bulk RNA sequencing data to delineate distinct groups with divergent biological characteristics. Furthermore, using an additional multiomic dataset (scRNA-seq and scATAC-seq), we validated our initial findings and created a de-noised enhancer-driven GRN reflecting the previously defined age-related signatures.

Conclusions

Applying integrated data analysis of the AML scAtlas, we reveal age-dependent gene regulation in t(8;21) AML, potentially reflecting immature/fetal HSC origin in prenatal origin disease vs postnatal origin. Our analysis revealed that BCLAF1, which is particularly enriched in pediatric AML with t(8;21) of inferred in-utero origin, is a promising prognostic indicator. The AML scAtlas provides a powerful resource to investigate molecular mechanisms underlying different AML subtypes.

Introduction

Acute myeloid leukemia (AML) is an aggressive blood cancer driven by non-random genomic rearrangements in hematopoietic stem/progenitor cells (HSPCs). Recurrent AML-associated genomic aberrations, which often involve transcriptional or epigenetic regulators, give rise to distinct patterns of gene expression strongly associated with clinical course and chemotherapy response ^{[1, 2]}. Single-cell RNA sequencing (scRNA-seq) studies have demonstrated that HSPCs acquire lineage priming at an early stage when still phenotypically immature and disperse down an erythromyeloid or lymphomyeloid differentiation trajectory ^[3]. In the context of AML, diverse clonal hierarchies include the co-existence of normal hematopoietic clones. Leukemic clones can partially recapitulate myeloid differentiation and have been shown to display functional differences even when defined by the same genotype ^[4-7]. Indeed, analysis of AML using scRNA-seq has revealed key clonal hierarchies, defining subtype-associated cell types, and dynamic changes following therapy, and has been critical in characterizing leukemic stem cells (LSCs), which propagate the disease and drive relapse ^[5-8].

Most AML scRNA-seq studies are limited by small sample numbers and include a mixture of different AML subtypes which may not be directly comparable to one another. Therefore, it is difficult to draw biological conclusions with sufficient robustness to be clinically translatable from these individual datasets. To overcome this, we performed large-scale integration of public scRNA-seq datasets to create a single-cell transcriptomic atlas for AML (AML scAtlas). Due to the range of data sources spanning time, locations, and experimental designs, complex batch effects often arise between scRNA-seq datasets, which require a tailored data integration approach ^{[9, 10]}. Thus, we benchmarked some widely used batch correction tools ^[11-13] for our specific data use case.

Given the broad representation of age groups in AML scAtlas, we sought to investigate a developmental aspect of AML biology. Pediatric AMLs have substantially better clinical outcomes compared to adult AMLs^[14-16]. The molecular landscape of AML differs between children and adults^{[2, 14-16]}; this may, in part, reflect differences in the developmental origins of the disease. Chromosomal changes in pediatric leukemia are acquired in-utero, as evidenced by leukemia-specific genomic aberrations detected in the Guthrie spots of children who later developed leukemia, sometimes several years after birth ^[17]. Adult leukemia, in contrast, is thought to develop later in life through acquisition of pre-leukaemic changes and clonal evolution of adult HSPCs ^{[18, 19]}. The impact of these developmental stages on leukemia biology remains incompletely understood, and no current methods exist to quantify and characterize differences in the origin of the disease. However, as childhood AML with presumed in-utero origin has a better outcome, determining the pre- or postnatal origin in teenagers and young adults might be important for better treatment stratification and prognostication.

AML with t(8;21) (AML1-ETO/RUNX1-RUNX1T1) is one of the most frequent AML subtypes in young people, although it affects all ages ^[2]. The prenatal origin of the t(8;21) rearrangement has been confirmed even in older children presenting with AML ^[17]. The prognosis of AML with t(8;21) is better in children than in teenagers and even more so than in young adults ^[20]. This outcome difference is not fully explainable by co-morbidities and may instead be related to the developmental origins of the disease. In the intermediate teenage group, t(8;21) AML may comprise both late childhood and early adult disease entities, a distinction that could have prognostic implications and could help to explain disease biology and clinical course.

We leveraged our AML scAtlas resource to characterize age- and developmental-stage-specific signatures in t(8;21) AML by applying single-cell gene regulatory network (GRN) inference^{[21, 22]} as a means of revealing cell state heterogeneity across age groups. We then validated and refined our findings in a larger cohort using bulk RNA sequencing (RNA-seq) data from the TARGET^[2] and BeatAML^[23] studies, defining age-associated GRN signatures and key regulators of t(8;21) AML that may reflect the developmental origins of the leukemia.

Profiling gene expression and chromatin accessibility together can decipher the enhancer-driven GRN (eGRN) and enriched transcriptional regulators. Significant heterogeneity across different patients and time points ^[24] was recently described by analyzing combined scRNA-seq and single-cell Assay for Transposase Accessible Chromatin sequencing (scATAC-seq). We used the t(8;21) AML data from this study to validate our initial findings by applying cutting-edge GRN inference methodology ^[25]. This methodology encompasses both modalities to provide a denoised eGRN, which we could correlate with our age-associated signatures.

Results

Large Scale Data Integration to Construct a Single-Cell Transcriptomic Atlas of AML (AML scAtlas)

To create the AML scAtlas, we integrated published scRNA-seq data of primary AML bone marrow samples from 16 suitable high-quality studies (see Materials and Methods), comprising 159 AML samples (Figure 1A; Supplementary Table 1). Where on-treatment time points were available, we selected only diagnostic samples to establish a reference atlas of primary AML at diagnosis. If studies had healthy donor bone marrow samples, these were included, alongside data from healthy bone marrow samples from four additional scRNA-seq studies (Supplementary Table 1) to enable comparisons between malignant and healthy bone marrow populations. After cell filtering and quality control, the AML scAtlas contains data from 748,679 high quality cells derived from a total of 20 different scRNA-seq studies ^{[4-6, 8, 26-40]} (Supplementary Figure 1A). Each sample was assigned to an AML clinical subtype, based on the recent European LeukemiaNet (ELN) clinical guidelines ^[41], and classified into the corresponding prognostic risk group. This resource captures a broad range of molecular subtypes of AML and spans different age groups, including both pediatric and adult AML cases (Figure 1B;1C). Overall, this is the largest dataset to date for exploring AML biology at single-cell resolution.

Figure 1. Large Scale Data Integration Creates a Single-Cell Atlas of AML
(A) Overview of the analysis steps in creating AML scAtlas. (B) Proportion of cells (left panel) and samples (right panel) belonging to each AML subtype as defined by the ELN clinical guideline. (C) Age group and gender distribution of AML scAtlas cohort samples. (D) scVI harmonized UMAP colored by annotated cell types. (E) The expression of key hematopoietic marker genes across annotated cell types shown on a dotplot. Color scale shows mean gene expression, dot size represents the fraction of cells expressing the given gene.

In the initial analysis of the combined dataset, batch effects were noted, with study-specific clustering, which was quantified using several benchmarking metrics (Supplementary Table 2; Supplementary Figure 1A;1B). Even within samples of the same study, sample-wise clustering was noted (Supplementary Figure 1C). To address this, we benchmarked several widely used batch correction methods (Supplementary Table 2; Supplementary Figure 2A;2C), identifying scVI as the best method for this dataset (Supplementary Table 2). We therefore employed scVI to correct for batch effects before clustering and cell type annotation in the AML scAtlas. Cell types were annotated using the consensus of multiple annotation tool results (Supplementary Table 3) and verified using cluster-wise marker gene expression (Figure 1D;1E).

Figure 2. Characterizing Cell Type Distributions in AML Subtypes
(A) UMAP highlighting the distribution of cells from different AML subtypes in AML scAtlas. (B) Schematic showing the workflow used to identify leukemic stem cells (LSCs) from the AML scAtlas hematopoietic stem and progenitor cell (HSPC) clusters. (C) Using the AML scAtlas HSPC clusters only, UMAP was regenerated and annotated with an AML-specific reference of leukemia stem and progenitor cells (LSPCs). (D) UMAPs showing the leukemic stem cell scores of each cell, for the LSC17 (left) and LSC6 (right). (E) Proportions of HSPC/LSPC populations in different AML subtypes (left) and AML risk groups (right), as defined by ELN clinical guidelines. (F) Comparison of LSC abundance in favourable and adverse ELN risk groups. Chi-Square test statistic: 8658.98, degrees of freedom: 1, P-value: 0.0.

Cell type proportions analyses across the clinically relevant subtypes in the dataset show that the AML subtypes were significantly biased towards myeloid cell types (CMP, MEP, GMP, ProMono, CD14+ Mono, CD16+ Mono, cDC, Erythroid) with each subtype exhibiting a clear predominant cell type consistent with AML clonal expansion (Figure 2A; Supplementary Figure 3A). In contrast, healthy donor samples had more balanced lineage proportions, with lymphoid cells (T, B, NK, ProB, pDC, Plasma) well represented (Figure 2A; Supplementary Figure 3A). Given the established critical role of HSPCs and LSCs in propagating AML, and their importance as therapeutic targets ^[42], we focused on HSPC clusters for further analysis (Figure 2B). To identify LSCs, we applied a curated reference profile of leukaemic stem and progenitor cells (LSPCs) ^[7] (Figure 2B;2C) and correlated this with calculated LSC6 ^[43] and LSC17^[44] scores for each cell (Figure 2D). We then compared the proportions of HSPC/LSPCs across different AML subtypes and risk groups, as defined by the ELN clinical guidelines ^[41] (Figure 2E). Higher-risk subtypes displayed a higher proportion of LSCs compared to favorable risk disease (Figure 2E;2F).
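
The comparison of LSC abundance between two risk groups can be illustrated with a chi-square test on a 2x2 table of cell counts. The sketch below is a minimal standard-library version; it assumes a Pearson test without continuity correction (the source does not state which variant was used), and the function name and the counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2x2 table.

    Rows are risk groups, columns are LSC / non-LSC cell counts:
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one cell")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts for illustration only (not the atlas data).
stat, p = chi_square_2x2(a=300, b=700, c=600, d=400)
```

With hundreds of thousands of cells the statistic can become very large. For a statistic of the size reported in Figure 2F, the p-value falls below the smallest representable double and is therefore printed as 0.0.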

Figure 3. AML scAtlas Reveals Age-Associated Heterogeneity in t(8;21) AML
(A) Depiction of the workflow to generate and validate the t(8;21) AML gene regulatory network (GRN) from AML scAtlas. (B) Using the AML scAtlas t(8;21) sample cells, UMAP was re-computed and shows the different cell types. (C) Bar plots of the absolute cell type numbers (left panel) and the cell type proportions (right panel) stratified by age group. The CD34 enrichment performed on several adult samples is reflected. (D) Using HSPCs and CMPs only, the pySCENIC gene regulatory network (GRN) and regulon AUC scores were calculated. Z-score normalized scores underwent hierarchical clustering to create a clustered heatmap and identify age-associated regulons. Regulons were prioritized using their regulon specificity scores (RSS).

Application of AML scAtlas to Identifying Age-Associated Gene Regulatory Networks in t(8;21) AML

The AML scAtlas enables robust comparison of adult and pediatric AML. We hypothesized that in adolescents and young adults with t(8;21) AML, the potential for either in-utero or postnatal HSPC origin disease might affect disease biology and prognosis. Thus, we sought to explore biological differences between pediatric and adult cases of t(8;21) AML, aiming to explain and potentially improve prognostication in adolescents and young adults. We selected samples with t(8;21) AML from the AML scAtlas, resulting in 105,663 cells from 13 adult cases (aged 20-67), 7 adolescent cases (aged 12-17), and 3 pediatric cases (aged 6-8) (Figure 3A-3C). Where gender information was not available, this was inferred from ChrY/XIST gene expression (Supplementary Figure 4A). Several adult samples underwent CD34 selection in original studies, excluding more differentiated cell types (mature lymphoid populations, monocytes, granulocytes) in these samples. Thus, these cell types were excluded from comparative analysis, focusing only on HSPCs and myeloid progenitors (CMP, GMP, MEP), which were well represented in all studies (Figure 3C).
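
Inferring sex from ChrY/XIST expression can be sketched as a simple comparison of XIST against a small panel of Y-linked genes. The source does not describe the exact rule, so the gene panel, the `min_signal` threshold and the function name below are assumptions for illustration.

```python
# Y-linked genes commonly used as male markers. The gene list and the
# decision rule are illustrative assumptions, not the authors' method.
CHR_Y_GENES = ("RPS4Y1", "DDX3Y", "UTY", "KDM5D", "EIF1AY")

def infer_sex(expression, min_signal=1.0):
    """Guess donor sex from per-sample expression of XIST and chrY genes.

    expression: dict mapping gene symbol -> normalized expression
        (for example, a pseudobulk average over all cells of a sample).
    Returns "female", "male", or "undetermined".
    """
    xist = expression.get("XIST", 0.0)
    chr_y = sum(expression.get(g, 0.0) for g in CHR_Y_GENES) / len(CHR_Y_GENES)
    if max(xist, chr_y) < min_signal:
        return "undetermined"  # too little signal to call either way
    return "female" if xist > chr_y else "male"
```

Returning "undetermined" for low-signal samples keeps ambiguous calls visible instead of forcing a label.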

Figure 4. Validation of Age-Associated Regulons in Large Bulk RNA-Seq Cohorts
(A) Using previously defined age-associated regulons, pySCENIC AUC scores (Z-score normalized) were clustered to identify samples most enriched for inferred-prenatal and inferred-postnatal origin signatures. (B) Volcano plot of differentially expressed genes when comparing the inferred-prenatal origin and inferred-postnatal origin samples. Adjusted P value threshold 0.01; log2 fold change threshold 0.5. Regulon signature associated TFs are indicated. (C) Enrichment plot of significant gene sets enriched in the inferred-prenatal origin samples. GSEA was performed on the DEGs using MSigDB databases. FDR q-value threshold <0.05. (D) Enrichment plot of drug sensitivity gene sets enriched in the inferred-prenatal samples. GSEA was performed on the DEGs, using drug response signatures from published studies of 4 widely used AML drugs. FDR q-value threshold <0.05. (E) The predicted cell type proportions of the inferred-prenatal and inferred-postnatal origin samples, estimated using AutoGeneS deconvolution, were compared using t-tests. Significant P values <0.05 (*), <0.01 (**), <0.001 (***) and <0.0001 (****) are indicated.

We reconstructed the GRN for the t(8;21) subset using the pySCENIC ^[22] pipeline, an efficient Python-based implementation of the original SCENIC method ^[21]. It is a state-of-the-art method for network inference from scRNA-seq data, is popular in the community ^[45-47], and has shown strong performance in a recent benchmarking study ^[48]. SCENIC’s three major steps are as follows. First, it identifies groups of co-expressed genes as potential targets of a transcription factor (TF). Second, it filters these groups of genes to retain only TF targets with the corresponding binding motif, forming “regulons.” Third, it uses the AUCell method (embedded within SCENIC) to quantify the activity of each regulon in every cell. AUCell calculates the Area Under the Curve (AUC) for the regulon’s gene set in a ranking of all genes by expression for each cell. The top 20 regulons for each age group were selected based on the regulon specificity score (RSS) (Supplementary Figure 4B). Unsupervised clustering on the Z-score normalized regulon activity score matrix revealed clear differences in the GRN across different age groups (Supplementary Figure 4C). We hypothesize that the differences in the GRN might reflect differences in the pre- or postnatal developmental origins of the disease. Additional testing of GRN inference from individual studies shows that the high number of cells refines the overall GRN (see Methods; Supplementary Figure 4D-E).
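
The AUCell step described above can be illustrated with a simplified, single-cell version written with the Python standard library. This is a sketch of the idea (ranking genes by expression and integrating the recovery curve of the regulon's genes over the top of the ranking), not the pySCENIC implementation; the function name, the tie-breaking rule and the default `top_fraction` of 0.05 are assumptions.

```python
def aucell_score(expression, gene_set, top_fraction=0.05):
    """Simplified AUCell-style activity score for one cell.

    expression: dict gene -> expression value in this cell.
    gene_set: iterable of regulon target genes.
    top_fraction: fraction of the gene ranking over which the recovery
        curve is integrated (assumed value).
    Returns a score between 0 and 1.
    """
    targets = set(gene_set) & set(expression)
    if not targets:
        return 0.0
    # Rank genes from highest to lowest expression; ties are broken by
    # gene name so the sketch is deterministic.
    ranking = sorted(expression, key=lambda g: (-expression[g], g))
    cutoff = max(1, int(round(top_fraction * len(ranking))))
    recovered = 0
    area = 0
    for gene in ranking[:cutoff]:
        if gene in targets:
            recovered += 1
        area += recovered  # step-wise area under the recovery curve
    # Best case: all targets sit at the very top of the ranking.
    k = min(len(targets), cutoff)
    max_area = k * (k + 1) // 2 + k * (cutoff - k)
    return area / max_area
```

Because the score depends only on the within-cell ranking of genes, it is insensitive to per-cell scaling of expression values, which is one reason rank-based scores are convenient when comparing cells from different studies.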

To define gene regulatory programs (co-occurring gene modules, defined by a transcription factor and its targets) which are specific to different age groups (termed ‘regulon signature’), we used the clustered dendrogram to select the regulon clusters most associated with the pediatric (below 10 years old) and adult samples (over 18 years old) (Figure 3D; Supplementary Figure 4C). The pediatric regulon signature, proposed to represent in-utero origin t(8;21) AML (henceforth termed ‘inferred-prenatal’), includes 16 regulons defined by a distinct group of hematopoietic transcription factors (TFs) (TRIM28, CTCF, RAD21, SOX4, TAL1, MYB, FOXN3, JUND, BCLAF1, ZBTB7A, IKZF1, MAZ, REST, YY1, CUX1, KDM5A), many of which have clearly defined roles in HSPCs and AML ^[49-51]. The adult regulon signature, presumed representative of the postnatally acquired t(8;21) AML (henceforth termed ‘inferred-postnatal’), combines 3 discrete clusters of regulons (YBX1, ENO1, and HDAC2; GATA1, POLE3, TFDP1, MYBL2, E2F4, and KLF1; IRF1, STAT1, IRF7, MAFF, ATF4, TAGLN2, SPI1, and KLF2), defined by TFs previously implicated in various hematopoietic, leukemic and inflammatory processes ^[52-54]. Importantly, both signatures contain key components of the AP-1 complex, which is heavily implicated in the biology of t(8;21) AML ^{[55, 56]} and undergoes dynamic changes during aging ^[57]. Samples of 6 individuals aged 12-17 clustered with the pediatric samples and showed enrichment for the inferred-prenatal signature (Figure 3D), suggesting that older adolescents (up to age 17 in our cohort) more closely resemble pediatric AML with t(8;21) and remain biologically distinct from adult-onset disease. This implies that the inferred in-utero origin of t(8;21) AML can also be present in AML diagnosed in older children.
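
The regulon specificity score used to prioritize regulons per age group is commonly defined from the Jensen-Shannon divergence (JSD) between a regulon's activity distribution across cells and the indicator distribution of a group. The sketch below assumes the form RSS = 1 - sqrt(JSD) with base-2 logarithms; the logarithm base and the function names are assumptions and may differ from the pySCENIC implementation.

```python
import math

def _jsd(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(x, y):
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def regulon_specificity_score(auc_scores, in_group):
    """RSS = 1 - sqrt(JSD(P_regulon, P_group)) for one regulon and one group.

    auc_scores: per-cell AUC values of the regulon (non-negative).
    in_group: per-cell booleans, True if the cell belongs to the group.
    """
    total_auc = sum(auc_scores)
    n_group = sum(1 for flag in in_group if flag)
    if total_auc <= 0 or n_group == 0:
        raise ValueError("need positive AUC mass and a non-empty group")
    p = [a / total_auc for a in auc_scores]           # regulon activity
    q = [1.0 / n_group if flag else 0.0 for flag in in_group]  # group indicator
    return 1.0 - math.sqrt(max(0.0, _jsd(p, q)))
```

Under these assumptions a regulon active only in the group's cells scores 1, a regulon active only outside the group scores 0, and ranking regulons by this score within each age group gives a "top N" list of the kind described above.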

Validation of Age-Associated Regulons in Bulk-RNA-Seq Cohorts of t(8;21) AML

We next sought to externally validate our age-associated regulon signatures in a larger cohort of patients. Bulk RNA-seq samples were obtained from the TARGET ^[2] and BeatAML ^[23] cohorts, selecting bone marrow samples taken at diagnosis in line with AML scAtlas data (n=83; Supplementary Table 4). We applied the AUCell algorithm from pySCENIC ^[22] to calculate the activity of our pediatric inferred-prenatal and adult inferred-postnatal regulons in each sample. Unsupervised clustering of the bulk RNA-seq AUCell results revealed discrete clusters of samples that were highly enriched for our inferred-prenatal and inferred-postnatal origin-associated regulons (Figure 4A).

Given the limitations of most scRNA-seq platforms in detecting lowly expressed genes, notably TFs, we leveraged bulk RNA-seq samples to refine our identified gene regulatory networks by detecting differentially expressed regulon-associated TFs. We used our inferred-prenatal and inferred-postnatal signature clusters and performed differential gene expression analysis between these samples, using two widely used tools (DESeq2^[59] and edgeR^[58]) to ensure robustness of the results (Figure 4B; Supplementary Figure 5A). We then compared differentially expressed regulon-associated TFs between the two groups and intersected this with the differential genes detected by each method. Although changes in TF expression are subtle (Figure 4B; Supplementary Figure 5A), we identify significantly differentially expressed TFs which reflect the observed differences in regulon activity and indicate the most critical regulons in our age-related GRN signatures (Figure 4A; Supplementary Figure 5B). This further delineated the inferred-prenatal signature to 5 key TFs (KDM5A, REST, BCLAF1, YY1, and RAD21), and the inferred-postnatal signature to 8 TFs (ENO1, TFDP1, MYBL2, TAGLN2, KLF2, IRF7, SPI1, and YBX1).
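
The intersection of regulon-associated TFs with the differential expression calls from two tools can be sketched as a set operation. The thresholds follow the Figure 4B caption (adjusted P value 0.01; absolute log2 fold change 0.5); the input format, the function names and the requirement that both tools agree on the direction of change are assumptions.

```python
def significant_genes(results, padj_max=0.01, lfc_min=0.5):
    """Return {gene: log2FC} for genes passing both thresholds.

    results: dict gene -> (log2_fold_change, adjusted_p_value).
    """
    return {
        gene: lfc
        for gene, (lfc, padj) in results.items()
        if padj < padj_max and abs(lfc) >= lfc_min
    }

def consensus_tfs(deseq2, edger, regulon_tfs):
    """Regulon TFs called significant by both tools, same direction of change."""
    sig_a = significant_genes(deseq2)
    sig_b = significant_genes(edger)
    shared = set(sig_a) & set(sig_b) & set(regulon_tfs)
    # Assumption: a TF is kept only if both tools agree on up vs down.
    return sorted(g for g in shared if (sig_a[g] > 0) == (sig_b[g] > 0))

# Toy values for illustration only (not results from the study).
toy_deseq2 = {"TF_A": (0.8, 0.001), "TF_B": (-0.9, 0.0005), "TF_C": (0.2, 0.001)}
toy_edger = {"TF_A": (0.7, 0.002), "TF_B": (-1.0, 0.001), "TF_C": (0.6, 0.001)}
kept = consensus_tfs(toy_deseq2, toy_edger, ["TF_A", "TF_B", "TF_C"])
```

Requiring agreement between two independent methods trades some sensitivity for robustness, which suits a setting where TF expression changes are subtle.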

Figure 5. Combining Multiomics Data Interrogates Age-Associated Regulons
(A) SCENIC+ eRegulon dotplot showing the correlation between scRNA-seq target gene activity (indicated by the color scale) and scATAC-seq target region accessibility (depicted by spot size). RSS identified the key activating eRegulons (+/+) between inferred-prenatal and inferred-postnatal origin disease and allows comparison of diagnosis (Dx) and relapse (Rel) time points. (B) Network showing the inferred-prenatal (blue) and inferred-postnatal (orange) associated eRegulons. Node size represents the number of target genes in each regulon. Edges represent interactions between nodes. (C) Over-representation analysis of age-associated eRegulon target genes using GO Biological Processes curated gene sets. Adjusted P value threshold 0.05. (D) Principal components analysis (PCA) of the gene-based eRegulon enrichment scores for the inferred-prenatal origin disease at diagnosis and relapse. The PC1 axis explains variance occurring between diagnosis and relapse, where this patient underwent a lineage switch. PC2 captures variance related to hematopoietic differentiation. (E) SCENIC+ perturbation simulation shows the predicted effect of knockout of selected TFs on the previously computed PCA embedding. Arrows indicate the predicted shift in cell states relative to the initial PCA embedding.

We next performed gene set enrichment analysis (GSEA) on significantly differentially expressed genes as determined by edgeR^[58], to investigate pathways enriched in the inferred-prenatal samples compared to the inferred-postnatal ones (Figure 4C). Notably, inferred-prenatal samples showed increased expression of stemness-associated genes and of target genes of SMARCA2, a key player in HSC gene expression regulation and chromatin remodeling ^[60]. SMARCA2 is also known to be upregulated during the fetal-to-adult HSC transition ^[61], implying that the observed SMARCA2 enrichment may indeed reflect the inferred fetal HSC cell-of-origin. Genes impacted by YY1 depletion were also downregulated compared to the samples of inferred-postnatal leukemia origin, which supports the identification of YY1 as an inferred-prenatal regulon (Figure 4C). To explore therapeutic implications, we performed GSEA using drug response signatures from published studies ^[62-65] (Figure 4D). This analysis revealed that inferred-prenatal origin t(8;21) AML is enriched for genes associated with increased chemosensitivity to cytarabine, venetoclax, and daunorubicin (Figure 4D).

We hypothesized that the increase in stemness-associated genes in the leukemia with the inferred-prenatal origin could be reflective of potential differences in the leukemic cell-of-origin and its impact on myeloid differentiation. We therefore performed cell type deconvolution using AutoGeneS ^[66], with a curated LSPC reference profile ^[7], to compare the cellular heterogeneity between prenatal and postnatal origin bulk RNA-seq samples (Figure 4E). This revealed a higher proportion of HSPC cell types (HSC, Prog), with a reduction in some differentiated myeloid cell types (ProMono-like, cDC-like) in the samples of inferred-prenatal origin (Figure 4E). To corroborate this finding, we examined cell type proportions in the original t(8;21) subset of AML scAtlas, confirming that cells with the inferred-prenatal signature comprise more HSCs than inferred-postnatal signature cells (Supplementary Figure 5D-E). However, comparison of cell type proportions in this dataset is confounded by differences in sample processing as some studies performed CD34 selection, hence there is more cell type diversity observed in the pediatric samples (Supplementary Figure 5D-E).

Multiomics Single-Cell Data Reveals a Denoised GRN and Identifies Candidate Perturbations in Prenatal Origin t(8;21) AML

We next used the scRNA-seq and scATAC-seq data from a recent cohort of pediatric t(8;21) AML patients ^[24] at multiple clinical time points to uncover the enhancer-driven GRN (eGRN) in inferred-prenatal and inferred-postnatal origin t(8;21) AML (Supplementary Table 4). Initially, we identified two representative samples of our inferred-prenatal and inferred-postnatal signatures by using pySCENIC AUCell ^[22] to measure the activity of our previously defined regulons. Unsupervised clustering of the AUC scores was used to infer whether each sample matched the regulon signatures, identifying one inferred-prenatal sample and one inferred-postnatal sample for downstream analysis (Supplementary Figure 6A).

We then applied SCENIC+ ^[25], which integrates scRNA-seq and scATAC-seq to identify candidate enhancer regions and TF-binding motifs, linking TFs to target genes and identified enhancers. This creates enhancer-driven regulons (eRegulons), forming an eGRN. We applied SCENIC+ ^[25] to the leukemia samples with the inferred-prenatal and inferred-postnatal origin at diagnosis and relapse, keeping only regulons that showed a correlation between both modalities to retain only the most robust regulons (Supplementary Figure 6B). This revealed several eRegulons across both patients (Supplementary Figure 6D), many of which were patient specific, particularly when comparing HSPC populations (Figure 5A). The inferred-prenatal sample displayed a specific HSC eRegulon profile. In contrast, the inferred-postnatal sample more closely resembled the corresponding Granulocyte-Monocyte Progenitor (GMP) (Figure 5A). Interestingly, at relapse the inferred-prenatal origin patient undergoes a chemotherapy-driven lineage switch to a lymphoid phenotype, which may suggest that the leukemia originated from a less committed progenitor (Figure 5A).

To identify clusters of closely related eRegulons, we computed the correlations between eRegulon enrichment scores. We identified two main clusters of eRegulons, which correspond to the different inferred-signature samples (Figure 5B; Supplementary Figure 6C). For each eRegulon cluster, we used the associated target genes as input for gene ontology over-representation analysis (ORA), to assess functional differences in the eGRN. This revealed fundamental differences in the underlying biological processes (Figure 5C; Supplementary Figure 7A). The AML sample with inferred-prenatal origin was enriched for many processes associated with development. In contrast, the inferred-postnatal sample appeared more metabolism focused (Figure 5C; Supplementary Figure 7A). This further supports the association of these eRegulons with presumed prenatal origin t(8;21) AML, compared to postnatal origin disease.

Previous analysis using the TARGET ^[2] and BeatAML ^[23] datasets indicated that inferred-prenatal and inferred-postnatal origin t(8;21) AML may harbor different levels of chemosensitivity based on published drug response signatures (Figure 4D). Therefore, we performed in silico perturbations of eRegulon-associated TFs. PCA of the diagnosis and relapse samples recapitulated the expected differentiation trajectories along PC2, while separating diagnosis from relapse along PC1 (Figure 5D). Using the SCENIC+ ^[25] perturbation simulation workflow, we identified TFs estimated to induce differentiation, as defined by a negative shift in PC2 (Supplementary Figure 7B). We prioritized TFs predicted to impact the HSC compartment and identified 18 TFs with predicted significant effects on HSC differentiation (Supplementary Figure 7C). Several of these are components of the AP-1 complex (JUN, ATF4, FOSL2), which are established downstream targets of the t(8;21) fusion protein and are known to propagate t(8;21) AML ^{[55, 67]} (Figure 5E; Supplementary Figure 7C).
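
Prioritizing TFs by a negative shift along PC2 can be sketched as a ranking of mean coordinate changes. The sketch assumes that PC2 coordinates of the cells of interest (for example HSCs) are available before and after each simulated knockout; the input layout and function name are hypothetical and do not reflect the SCENIC+ API.

```python
from statistics import mean

def rank_by_pc2_shift(baseline_pc2, perturbed_pc2):
    """Rank TFs by mean PC2 shift after a simulated knockout.

    baseline_pc2: list of PC2 coordinates for the cells of interest.
    perturbed_pc2: dict TF -> list of PC2 coordinates for the same cells,
        in the same order, after the simulated perturbation.
    Returns [(tf, mean_shift)] with the most negative shift first,
    keeping only TFs whose mean shift is negative.
    """
    shifts = {}
    for tf, coords in perturbed_pc2.items():
        if len(coords) != len(baseline_pc2):
            raise ValueError(f"cell count mismatch for {tf}")
        shifts[tf] = mean(
            after - before for before, after in zip(baseline_pc2, coords)
        )
    negative = [(tf, s) for tf, s in shifts.items() if s < 0]
    return sorted(negative, key=lambda item: item[1])

# Toy coordinates for illustration only.
ranked = rank_by_pc2_shift(
    [1.0, 2.0],
    {"TF_A": [0.0, 1.0], "TF_B": [1.5, 2.5], "TF_C": [0.5, 1.5]},
)
```

In this toy example TF_A and TF_C are retained because their simulated knockouts move cells towards lower PC2 values, while TF_B is dropped.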

Using AP-1 complex members as a comparative baseline, we identified EP300 as one of the most impactful hits. EP300 has recently been shown to drive t(8;21) AML self-renewal through an acetylation-dependent mechanism ^[68]. This suggests that presumed prenatal origin pediatric t(8;21) AML may be particularly sensitive to EP300 inhibition. One of the most striking predictions, for both diagnostic and relapse HSC populations, was BCLAF1 (Figure 5E; Supplementary Figure 7C). BCLAF1 is a regulator of normal HSPCs ^[69], and its expression level declines during hematopoietic differentiation. While recent studies have identified a role for BCLAF1 in AML ^[70], this has not been explored in detail in the context of pediatric AML or t(8;21) AML and may present a therapeutic opportunity.

We also performed SCENIC+ ^[25] perturbation modelling on the postnatal origin sample (AML12). In this case, the PCA was less straightforward to interpret, as branching differentiation trajectories towards a lymphoid or myeloid fate appear along the PC2 axis, while PC1 distinguishes diagnosis and relapse samples (Supplementary Figure 7D). Therefore, we prioritized TFs based on a predicted effect similar to that of AP-1 complex components, as this complex is known to be a critical regulator in t(8;21) AML. We found that several TFs from our original postnatal origin signature were predicted to have an effect (Supplementary Figures 7D-F), supporting the relevance of the GRNs identified in our previous analyses.

To further investigate EP300 and BCLAF1, we queried the DepMap ^[71] database to assess the dependency of t(8;21) AML cell lines on these genes (Supplementary Figure 7G). We found that the two widely used cell lines of t(8;21) AML, KASUMI-1 (7-year-old donor) ^[72] and SKNO-1 (22-year-old donor) ^[73], were among the most sensitive to these perturbations based on their DepMap effect scores (Supplementary Figure 7G). Several other cell lines sensitive to BCLAF1 were derived from pediatric cancers, most notably neuroblastomas, which also arise in-utero ^[74] (Supplementary Figure 7H). Together, these findings suggest that EP300 inhibition may be particularly effective in t(8;21) AML, and that BCLAF1 may present a new therapeutic target for t(8;21) AML, particularly in pediatric cases with inferred prenatal origin of the driver translocation.

Discussion

Here we have generated a new data resource, AML scAtlas, to investigate AML biology across a broad range of subtypes at single-cell resolution. By including 222 samples comprising 748,679 cells from patients with a wide range of clinical characteristics, AML scAtlas overcomes the limitations of many standalone single-cell studies, enabling AML subtype-focused analysis with enough data for robust statistical comparisons. This dataset is publicly available (https://cellxgene.bmh.manchester.ac.uk/AML/), providing the AML research community with a resource to address diverse biological questions and generate new hypotheses.

To address a clinically relevant question using this data source, we compared pediatric and adult-onset disease, motivated by the potential biological effect of the in-utero origin of pediatric leukemia. We used AML scAtlas to explore the GRNs in adult and pediatric t(8;21) AML, which revealed a strong age-associated GRN signature. This suggests that while pediatric and adult t(8;21) AMLs are propagated by the same driver translocation, they exhibit clear biological differences correlated with age. This may be due to differences in the cell-of-origin, with mouse models showing that t(8;21) AML can arise from a HSC or a more lineage restricted GMP ^[75]. As pediatric t(8;21) can arise in-utero, as evidenced by previous studies ^[17], and adult t(8;21) is acquired postnatally ^{[18, 19]}, we propose that the observed age-related differences in AML with t(8;21) reflect these differences in the developmental origins of the disease. We identified two distinct groups of regulons corresponding to either inferred-prenatal origin or inferred-postnatal origin disease. These regulons constitute the GRN underlying the cellular state, which can be informative when identifying molecular vulnerabilities to target leukemia.

Our cohort is the largest scRNA-seq dataset used to explore t(8;21) AML biology to date. However, the number of patients included remains low (n=22), and many of the studies containing the adult samples used CD34 selection in their experimental protocol, creating a bias towards HSPCs in these samples. To overcome some of these limitations, we used bulk RNA-seq samples from the TARGET ^[2] and BeatAML ^[23] studies with t(8;21) AML (n=83) to validate our regulon signatures. This identified two clusters of samples which closely match these signatures, showing that the regulon patterns identified from our AML scAtlas are recapitulated in bulk RNA-seq data, enabling exploration of larger patient cohorts. Comparisons between inferred-prenatal and inferred-postnatal origin transcriptomes prioritized TFs which were differentially expressed and highlighted differences in underlying biology and drug response. We identified 5 signature TFs (KDM5A, REST, BCLAF1, YY1, RAD21) for inferred-prenatal origin disease, several of which have roles in embryonic stem cells ^{[76, 77]} and critical functions in the maintenance of HSCs ^[49-51]. In contrast, TFs identified in inferred-postnatal origin samples, such as interferon regulatory factors (IRFs), HDAC2, and SPI1, reflect inflammatory and immune processes, many of which have been implicated in leukemia ^[52-54]. We also found that inferred-prenatal origin samples had a higher proportion of HSC/Prog cell types than inferred-postnatal origin samples, indicating a more primitive state and supporting the hypothesis that age-associated differences in the cell-of-origin influence disease biology.

Given these biological differences, we used bulk RNA-seq to predict chemosensitivity using published drug response signatures ^[62-65]. Inferred-prenatal samples were enriched for genes indicative of cytarabine sensitivity and depleted of genes suggestive of daunorubicin and venetoclax resistance. These findings suggest that the developmental origins of the disease may influence drug responses, with potential implications in the design of novel therapeutic strategies and providing further biological evidence that pediatric AML might benefit from different clinical management compared with adult-onset AML. Importantly, venetoclax is currently in the AML23 trial (NCT05955261); our results support further evaluation of venetoclax treatment in pediatric t(8;21) AML.

Using SCENIC+ on an additional single-cell multiomic dataset, we reconstructed the eGRN in samples matching our inferred-prenatal and inferred-postnatal regulon signatures. Upon comparing eRegulons for each patient at diagnosis and relapse, we identified clusters of highly correlated eRegulons defined by different biological processes. Inferred-prenatal origin samples are characterized by developmental and transcriptional dysregulation, whereas inferred-postnatal origin samples are largely driven by fundamental cellular processes linked to inflammation. Using SCENIC+ to model the predicted impact of TF perturbations on our prenatal origin sample at diagnosis and relapse, we identified several key components of the AP-1 complex, which are critical in t(8;21) AML biology and are also associated with dynamic age-related transcriptional changes ^[55-57].

Through our analysis, we identified EP300 as a candidate target, which has been shown to be critical for t(8;21) AML biology ^[68] with demonstrable effects in KASUMI-1 and SKNO-1 cell lines. EP300 has been identified as a promising therapeutic target in AML with several molecules in development ^[78]; our data indicate potential specific therapeutic benefit in prenatal origin t(8;21) AML. One of the most impactful perturbation predictions for the HSC compartment at diagnosis and relapse was BCLAF1. This is consistent with previous evidence of its importance in HSCs ^{[69, 79]} and AML ^[70], but BCLAF1 has not previously been studied specifically in the context of pediatric AML or t(8;21) AML. The DepMap data show that KASUMI-1 is the most sensitive myeloid cell line to BCLAF1 perturbation, and our GRN analyses suggest it is particularly active in pediatric t(8;21) AML of inferred in-utero origin; thus, this may represent an additional prognostic indicator.

Further investigations are required to characterize the roles of both EP300 and BCLAF1 in prenatal origin t(8;21) AML before any clinical realisation. EP300 has already been investigated as a target in AML, so future work should focus on the pediatric AML setting with in-vitro and in-vivo studies using EP300/CBP inhibitors such as inobrodib ^[78]. In contrast, BCLAF1 is relatively unexplored, and additional work is required to elucidate its molecular function and assess its potential as a therapeutic target. BCLAF1 may ultimately prove most valuable as a biomarker of in-utero t(8;21) AML, enabling distinction between late-onset in-utero and postnatal disease. This would require molecular validation in a large cohort of pediatric patients with Guthrie spots to confirm whether they had acquired t(8;21) in-utero.

Conclusions

Overall, our study demonstrates that large-scale single-cell data integration is a powerful approach to dissect specific patient groups in detail and enables robust comparative analyses. We present the AML scAtlas as a publicly available resource for the research community to address diverse biological questions. By applying AML scAtlas to t(8;21) AML, we identified age-associated gene regulatory networks that likely reflect differences in the developmental origins, biology and outcome of the disease. These findings also highlight novel candidate therapeutic targets which may be more relevant in pediatric t(8;21) AML compared to adult-onset disease, offering opportunities for more tailored treatment strategies.

Methods

For the complete analysis code, including the conda environments used for analysis, see GitHub Repo (https://github.com/jesswhitts/AML-scAtlas).

Data Collection

A literature search was performed for published AML scRNA-seq datasets ^{[4-6, 8, 26-40]}. Suitable studies were selected based on the data quality (over 1000 counts and 500 genes detected per cell for most of the data). Diagnostic, primary AML samples were selected from each AML study. Where healthy donor samples were present, these were also included, along with an additional 4 studies with healthy bone marrow samples.

Initial Data Processing

Each scRNA-seq dataset underwent initial quality control individually using Scanpy (v1.9.3) ^[80], as some studies provided raw data and others provided pre-filtered data. Where raw data was provided, doublets were removed using Scrublet (v0.2.3) ^[81] and cells were filtered using the median absolute deviation, as described in the single-cell best practices handbook ^{[10, 82]}.
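
As an illustration of median absolute deviation (MAD) filtering, the following minimal sketch flags values that lie more than a chosen number of MADs from the median. It is not the code used in this study: the function name `mad_outliers`, the cut-off of 5 MADs, and the toy counts are assumptions made for the example, and the published workflow applies this idea to per-cell quality metrics within Scanpy.

```python
from statistics import median


def mad_outliers(values, n_mads=5.0):
    """Flag values lying more than n_mads MADs away from the median."""
    values = list(values)
    med = median(values)
    # Median absolute deviation: median of absolute distances to the median.
    mad = median(abs(v - med) for v in values)
    return [abs(v - med) > n_mads * mad for v in values]


# Hypothetical per-cell total counts; the last cell is an extreme outlier.
total_counts = [1200, 1500, 1350, 1420, 1280, 1610, 1390, 98000]
flags = mad_outliers(total_counts)
kept = [c for c, is_outlier in zip(total_counts, flags) if not is_outlier]
```

Because the cut-off is defined relative to each dataset's own distribution, the same rule can be applied to studies with very different sequencing depths.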

Once filtered, datasets were combined, and quality control was performed using Scanpy (v1.9.3) ^[80]. The full dataset had quality thresholds applied (percentage mitochondrial counts <10, read counts >1000, gene counts >500), removing any samples which had fewer than 50 cells remaining after filtering. Genes present in <50 cells were removed. MALAT1 was removed as this was highly abundant in many cells and considered artefactual.
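
The thresholds above can be sketched in plain Python as follows. This is a simplified illustration rather than the pipeline itself: the dictionary-based cell records, the field names, and the toy samples `S1` and `S2` are hypothetical, and only the numeric thresholds are taken from the text.

```python
from collections import Counter

# Thresholds stated in the text.
MAX_PCT_MITO = 10.0
MIN_COUNTS = 1000
MIN_GENES = 500
MIN_CELLS_PER_SAMPLE = 50


def passes_qc(cell):
    """Apply the per-cell quality thresholds."""
    return (
        cell["pct_mito"] < MAX_PCT_MITO
        and cell["n_counts"] > MIN_COUNTS
        and cell["n_genes"] > MIN_GENES
    )


def filter_cells(cells, min_cells=MIN_CELLS_PER_SAMPLE):
    """Keep cells passing QC, then drop samples left with too few cells."""
    kept = [c for c in cells if passes_qc(c)]
    per_sample = Counter(c["sample"] for c in kept)
    return [c for c in kept if per_sample[c["sample"]] >= min_cells]


# Hypothetical data: S1 keeps 60 good cells; S2 has only 10 and is removed.
cells = (
    [{"sample": "S1", "pct_mito": 4.0, "n_counts": 3000, "n_genes": 1200}] * 60
    + [{"sample": "S1", "pct_mito": 25.0, "n_counts": 3000, "n_genes": 1200}] * 5
    + [{"sample": "S2", "pct_mito": 3.0, "n_counts": 2500, "n_genes": 900}] * 10
)
filtered = filter_cells(cells)
```

Applying the per-sample minimum after the per-cell filters matters, since a sample can fall below 50 cells only once its low-quality cells have been removed.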

Batch Correction

The presence of batch effects was determined through dimensionality reduction and clustering using Scanpy (v1.9.3) ^[80] and using the kBET algorithm (v0.99.6) ^[83]. This was repeated on individual studies, to assess whether there were sample-wise batch effects. Batch correction benchmarking was implemented using Harmony (Scanpy v1.9.3 implementation) ^[11], scVI (v1.0.3) ^[12], and scANVI (v1.0.3) ^[13] and quantified using scIB (v1.1.4) ^[9]. Different numbers of highly variable genes were used to select the optimal number for integration. Batch correction was performed using scVI (v1.0.3) ^[12] with the top 2000 highly variable genes, using sample as the model covariate.

AML scAtlas Cell Type Annotation

The scVI corrected embedding was used to run UMAP and Leiden clustering using Scanpy functions (v1.9.3) ^[80]. Cell type annotation was performed using CellTypist (v1.6.0) ^[84] using the ‘Immune_All_Low.pkl’ model, SingleR (v2.0.0) ^[85] using the Novershtern hematopoietic reference ^[86], and scType (v1.0) ^[87] with the tissue defined as ‘Immune system’. Full automated tool outputs are detailed in Supplementary Table 3; overall we found that the results varied substantially between different tools. We postulate that this is, in part, due to differences in the reference profiles used. Thus, we opted to use the best consensus of these different tools for our cluster identity assignments.

AML scAtlas LSC Annotation

HSPC clusters were selected from AML scAtlas, and the scVI corrected embedding was used to re-compute UMAP using Scanpy functions (v1.9.3) ^[80]. As our previous cell type annotations used generic reference profiles and were not AML specific, we generated a custom cell type annotation reference to identify LSCs. We created a custom SingleR (v2.0.0) ^[85] reference using the Zeng et al ^[7] revised annotations of the Van Galen et al ^[5] dataset (Supplementary Table 3). This was also correlated with the LSC6 ^[43] and LSC17 ^[44] scores for each cell. To compare LSC abundance between ELN risk groups, chi2_contingency was implemented from SciPy (v1.12.0).
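
To make the contingency test concrete, the sketch below computes Pearson's chi-squared statistic and its degrees of freedom for a table of LSC versus non-LSC cell counts across three risk groups. The counts are invented for illustration and the helper `chi2_statistic` is hypothetical; SciPy's `chi2_contingency` additionally returns the p-value and the expected frequencies.

```python
def chi2_statistic(table):
    """Pearson chi-squared statistic and degrees of freedom for an R x C table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts. Rows: favorable, intermediate, adverse risk.
# Columns: LSC, non-LSC.
table = [[10, 90], [20, 80], [30, 70]]
stat, dof = chi2_statistic(table)
```

With single-cell data the cell counts are large, so even small differences in proportions can reach significance; effect sizes are worth reporting alongside the test.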

AML with t(8;21) Analysis

Samples with the t(8;21) translocation were selected from the full AML scAtlas. The UMAP was re-computed, and genes detected in fewer than 50 cells in the revised dataset were removed, leaving 24,866 genes. Gene regulatory network analysis was performed using pySCENIC ^{[21, 22]} (v0.12.1) as per the recommended workflow. To facilitate comparisons between age groups, the analysis was restricted to HSPCs, as many adult samples were originally enriched for CD34. The RSS was calculated for the adult and pediatric samples to select the top 20 differential regulons per age group. Using SciPy hierarchical clustering (v1.12.0), regulons were filtered to identify regulon signature groups used for downstream analysis.
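
The regulon specificity score (RSS) is based on the Jensen-Shannon divergence between a regulon's activity across cells and an indicator vector for the group of interest. The sketch below is a minimal illustration, not the pySCENIC implementation: the function name is hypothetical, the AUC values are toy data, and base-2 logarithms are an assumption chosen so that the score lies between 0 and 1.

```python
import math


def _kl(p, q):
    """Kullback-Leibler divergence in bits, skipping zero-probability terms."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def regulon_specificity(auc, in_group):
    """RSS = 1 - sqrt(JSD) between normalised activity and group membership."""
    auc_total = sum(auc)
    group_total = sum(in_group)
    p = [a / auc_total for a in auc]
    q = [g / group_total for g in in_group]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * _kl(p, m) + 0.5 * _kl(q, m)
    return 1.0 - math.sqrt(max(jsd, 0.0))


# Toy example: four cells, the first two belong to the pediatric group.
pediatric = [1, 1, 0, 0]
specific = regulon_specificity([0.8, 0.8, 0.0, 0.0], pediatric)
unspecific = regulon_specificity([0.0, 0.0, 0.8, 0.8], pediatric)
uniform = regulon_specificity([0.5, 0.5, 0.5, 0.5], pediatric)
```

Ranking regulons by this score within each age group and taking the highest-scoring entries gives a per-group list of the kind described above.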

Bulk RNA-Seq Analysis

Bulk RNA-Seq data was downloaded for the TARGET ^[2] and BeatAML ^[23] cohorts and samples with t(8;21) were selected. Only bone marrow samples taken at diagnosis were used for downstream analyses (Supplementary Table 4). Using the previously defined signature regulons, the AUCell algorithm ^[21] (v0.12.1) was implemented to measure regulon activity. Hierarchical clustering was performed using SciPy (v1.12.0) to identify samples most enriched for each age-related signature. Differential gene expression analysis was implemented using edgeR ^[58] (v3.42.4) and DESeq2 ^[59] (v1.40.2), using a log2 fold change threshold of 0.5 and an adjusted p value cutoff of 0.01. Candidate differential genes, ranked on log2 fold change, underwent GSEA with GSEApy (v0.10.8) using a significance threshold of 0.05. Cell type deconvolution was performed using AutoGeneS ^[66] (v1.0.4) using the recommended workflow. Significance when comparing groups was ascertained using a Student’s t-test on the predicted cell type proportion values for each sample.
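
The selection and ranking of candidate genes ahead of GSEA can be sketched as below. The thresholds are those given in the text, but everything else is an assumption for illustration: the record layout, the placeholder gene names, and in particular the choice to threshold the absolute log2 fold change, which the text does not specify.

```python
def select_de_genes(results, lfc_threshold=0.5, padj_cutoff=0.01):
    """Keep genes passing both thresholds, ranked by descending log2 fold change."""
    hits = [
        r for r in results
        # Assumption: the fold-change threshold applies in both directions.
        if abs(r["log2fc"]) > lfc_threshold and r["padj"] < padj_cutoff
    ]
    return sorted(hits, key=lambda r: r["log2fc"], reverse=True)


# Hypothetical differential expression output with placeholder gene names.
results = [
    {"gene": "GENE_B", "log2fc": -1.4, "padj": 0.005},
    {"gene": "GENE_A", "log2fc": 2.1, "padj": 0.001},
    {"gene": "GENE_C", "log2fc": 0.3, "padj": 0.0001},  # fold change too small
    {"gene": "GENE_D", "log2fc": 1.8, "padj": 0.2},     # not significant
]
ranked = select_de_genes(results)
ranked_genes = [r["gene"] for r in ranked]
```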

Single-Cell Multi-Omics Analysis

The Lambo et al ^[24] scRNA-seq and scATAC-seq data from pediatric AML bone marrow samples was downloaded, and the t(8;21) samples selected (Supplementary Table 4). Using our previously defined signature regulons, the AUCell algorithm ^[21] (v0.12.1) was implemented to measure regulon activity. This identified the samples most enriched for each age-related signature as AML16 and AML12.
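
For intuition, an AUCell-style score measures how strongly a regulon's target genes are concentrated at the top of a sample's expression ranking. The sketch below is a deliberately simplified version and not the pySCENIC implementation, which differs in normalisation and tie handling; the function name, the assumed default of scoring the top 5% of genes, and the toy expression profile are all illustrative.

```python
def aucell_like_score(expression, gene_set, top_fraction=0.05):
    """Normalised area under the recovery curve within the top-ranked genes."""
    # Rank genes from highest to lowest expression.
    ranked = sorted(expression, key=expression.get, reverse=True)
    cutoff = max(1, int(top_fraction * len(ranked)))
    targets = set(gene_set) & set(ranked)
    if not targets:
        return 0.0
    hits = 0
    area = 0
    for gene in ranked[:cutoff]:
        if gene in targets:
            hits += 1
        area += hits  # Height of the recovery curve at this rank.
    # Best case: every target gene sits at the very top of the ranking.
    max_area = sum(min(rank, len(targets)) for rank in range(1, cutoff + 1))
    return area / max_area


# Hypothetical profile: 20 genes, G0 most highly expressed, G19 lowest.
expression = {f"G{i}": 100 - i for i in range(20)}
top_score = aucell_like_score(expression, {"G0", "G1"}, top_fraction=0.25)
low_score = aucell_like_score(expression, {"G18", "G19"}, top_fraction=0.25)
mixed_score = aucell_like_score(expression, {"G0", "G19"}, top_fraction=0.25)
```

Because the score depends only on within-sample gene rankings, it can be applied to both single cells and bulk samples without cross-sample normalisation, which is why the same signature can be evaluated in both data types.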

The SCENIC+ pipeline ^[25] (v1.0a1) was implemented as per the recommended Snakemake workflow for creating pseudo-multiome data. Regulons were filtered by correlation between modalities, using a threshold of 0.2 for non-multiome data. The most robust regulons were prioritized based on the SCENIC+ recommendations (direct +/+) ^[25]. To facilitate comparisons, the eRegulon RSS was calculated for each patient and the top 30 eRegulons selected. The correlation between the gene sets underpinning eRegulons was calculated and sample-associated clusters were selected for over-representation analysis with clusterProfiler ^[88] (v4.8.3).

To predict the impact of specific TF perturbations on key cell types, SCENIC+ ^[25] perturbation modelling was implemented using the recommended parameters. TFs were then prioritized by their predicted impact on HSC differentiation and visualised using the PCA embedding. Candidate targets EP300 and BCLAF1 were queried in the DepMap ^[71] database to infer their potential importance.

Acknowledgements

JW was funded by an MRC DTP award (MR/W007428/1), SM by Blood Cancer UK (15038) and CCLG (2016 09), GL by Cancer Research UK (C5759/A20971 & C5759/A27412), and MI by the MRC (MR/X014088/1).

We would like to thank all authors of the public data used in this study for their contributions to the scientific community. We also acknowledge useful discussions around this work with Magnus Rattray.

Additional information

Data availability

The AML scAtlas is hosted online for public use (https://cellxgene.bmh.manchester.ac.uk/AML/). The processed AnnData object is also available to download from figshare (DOI: 10.48420/27269946). Details of all data used in this study can be found in Supplementary Table 1, along with associated links to the original data. All code used to perform the analyses presented here can be accessed in the GitHub repository: (https://github.com/jesswhitts/AML-scAtlas). The samples used for validation analyses are publicly available and are detailed in Supplementary Table 4. The SCENIC+ eGRN files are provided as Supplementary Table 5.

Author Contributions

MI, SMB, and GL conceived the study and oversaw the research. JW did the data collection and implemented the analyses. SM guided sub-analyses and assisted with interpretation. JW wrote the first draft of the manuscript. All authors interpreted the data and edited the manuscript. All authors approved the final manuscript.

Funding

Medical Research Council (MR/W007428/1)

Medical Research Council (MR/X014088/1)

Cancer Research UK (C5759/A20971)

Cancer Research UK (C5759/A27412)

Blood Cancer UK (15038)

Additional files

Supplementary figures and tables

Supplementary Table 3

Supplementary Table 4

Supplementary Table 5

References

1. Tenen DG (2003) Disruption of differentiation in human cancer: AML shows the way. Nature Reviews Cancer 3:89–101.
2. Bolouri H, Farrar JE, Triche T, Ries RE, Lim EL, Alonzo TA, Ma Y, Moore R, Mungall AJ, Marra MA, Zhang J, Ma X, Liu Y, Liu Y, Auvil JMG, Davidsen TM, Gesuwan P, Hermida LC, Salhia B, Capone S, Ramsingh G, Zwaan CM, Noort S, Piccolo SR, Kolb EA, Gamis AS, Smith MA, Gerhard DS, Meshinchi S (2018) The molecular landscape of pediatric acute myeloid leukemia reveals recurrent structural alterations and age-specific mutational interactions. Nat Med 24:103–12.
3. Velten L, Haas SF, Raffel S, Blaszkiewicz S, Islam S, Hennig BP, Hirche C, Lutz C, Buss EC, Nowak D, Boch T, Hofmann WK, Ho AD, Huber W, Trumpp A, Essers MA, Steinmetz LM (2017) Human haematopoietic stem cell lineage commitment is a continuous process. Nat Cell Biol 19:271–81.
4. Velten L, Story BA, Hernández-Malmierca P, Raffel S, Leonce DR, Milbank J, Paulsen M, Demir A, Szu-Tu C, Frömel R, Lutz C, Nowak D, Jann J-C, Pabst C, Boch T, Hofmann W-K, Müller-Tidow C, Trumpp A, Haas S, Steinmetz LM (2021) Identification of leukemic and pre-leukemic stem cells by clonal tracking from single-cell transcriptomics. Nature Communications 12:1366.
5. van Galen P, Hovestadt V, Wadsworth II MH, Hughes TK, Griffin GK, Battaglia S, Verga JA, Stephansky J, Pastika TJ, Lombardi Story J, Pinkus GS, Pozdnyakova O, Galinsky I, Stone RM, Graubert TA, Shalek AK, Aster JC, Lane AA, Bernstein BE (2019) Single-Cell RNA-Seq Reveals AML Hierarchies Relevant to Disease Progression and Immunity. Cell 176:1265–81.
6. Beneyto-Calabuig S, Merbach AK, Kniffka JA, Antes M, Szu-Tu C, Rohde C, Waclawiczek A, Stelmach P, Gräßle S, Pervan P, Janssen M, Landry JJM, Benes V, Jauch A, Brough M, Bauer M, Besenbeck B, Felden J, Bäumer S, Hundemer M, Sauer T, Pabst C, Wickenhauser C, Angenendt L, Schliemann C, Trumpp A, Haas S, Scherer M, Raffel S, Müller-Tidow C, Velten L (2023) Clonally resolved single-cell multi-omics identifies routes of cellular differentiation in acute myeloid leukemia. Cell Stem Cell 30:706–21.
7. Zeng AGX, Bansal S, Jin L, Mitchell A, Chen WC, Abbas HA, Chan-Seng-Yue M, Voisin V, van Galen P, Tierens A, Cheok M, Preudhomme C, Dombret H, Daver N, Futreal PA, Minden MD, Kennedy JA, Wang JCY, Dick JE (2022) A cellular hierarchy framework for understanding heterogeneity and predicting drug response in acute myeloid leukemia. Nature Medicine 28:1212–23.
8. Stetson LC, Balasubramanian D, Ribeiro SP, Stefan T, Gupta K, Xu X, Fourati S, Roe A, Jackson Z, Schauner R, Sharma A, Tamilselvan B, Li S, de Lima M, Hwang TH, Balderas R, Saunthararajah Y, Maciejewski J, LaFramboise T, Barnholtz-Sloan JS, Sekaly RP, Wald DN (2021) Single cell RNA sequencing of AML initiating cells reveals RNA-based evolution during disease progression. Leukemia 35:2799–812.
9. Luecken MD, Büttner M, Chaichoompu K, Danese A, Interlandi M, Mueller MF, Strobl DC, Zappia L, Dugas M, Colomé-Tatché M, Theis FJ (2022) Benchmarking atlas-level data integration in single-cell genomics. Nature Methods 19:41–50.
10. Heumos L, Schaar AC, Lance C, Litinetskaya A, Drost F, Zappia L, Lücken MD, Strobl DC, Henao J, Curion F, Single-cell Best Practices Consortium, Schiller HB, Theis FJ (2023) Best practices for single-cell analysis across modalities. Nature Reviews Genetics 24:550–572.
11. Korsunsky I, Millard N, Fan J, Slowikowski K, Zhang F, Wei K, Baglaenko Y, Brenner M, Loh P-R, Raychaudhuri S (2019) Fast, sensitive and accurate integration of single-cell data with Harmony. Nature Methods 16:1289–96.
12. Lopez R, Regier J, Cole MB, Jordan MI, Yosef N (2018) Deep generative modeling for single-cell transcriptomics. Nature Methods 15:1053–8.
13. Xu C, Lopez R, Mehlman E, Regier J, Jordan MI, Yosef N (2021) Probabilistic harmonization and annotation of single-cell transcriptomics data with deep generative models. Molecular Systems Biology 17:e9620.
14. Balgobind BV, Van den Heuvel-Eibrink MM, De Menezes RX, Reinhardt D, Hollink IH, Arentsen-Peters ST, van Wering ER, Kaspers GJ, Cloos J, de Bont ES, Cayuela JM, Baruchel A, Meyer C, Marschalek R, Trka J, Stary J, Beverloo HB, Pieters R, Zwaan CM, den Boer ML (2011) Evaluation of gene expression signatures predictive of cytogenetic and molecular subtypes of pediatric acute myeloid leukemia. Haematologica 96:221–30.
15. Wiggers CRM, Baak ML, Sonneveld E, Nieuwenhuis EES, Bartels M, Creyghton MP (2019) AML Subtype Is a Major Determinant of the Association between Prognostic Gene Expression Signatures and Their Clinical Significance. Cell Reports 28:2866–77.
16. Chaudhury S, O’Connor C, Cañete A, Bittencourt-Silvestre J, Sarrou E, Choi J, Johnston P, Wells CA, Gibson B, Keeshan K (2018) Age-specific biological and molecular profiling distinguishes paediatric from adult acute myeloid leukaemias. Nat Commun 9:5280.
17. Wiemels JL, Xiao Z, Buffler PA, Maia AT, Ma X, Dicks BM, Smith MT, Zhang L, Feusner J, Wiencke J, Pritchard-Jones K, Kempski H, Greaves M (2002) In utero origin of t(8;21) AML1-ETO translocations in childhood acute myeloid leukemia. Blood 99:3801–5.
18. Welch JS, Ley TJ, Link DC, Miller CA, Larson DE, Koboldt DC, Wartman LD, Lamprecht TL, Liu F, Xia J, Kandoth C, Fulton RS, McLellan MD, Dooling DJ, Wallis JW, Chen K, Harris CC, Schmidt HK, Kalicki-Veizer JM, Lu C, Zhang Q, Lin L, O’Laughlin MD, McMichael JF, Delehaunty KD, Fulton LA, Magrini VJ, McGrath SD, Demeter RT, Vickery TL, Hundal J, Cook LL, Swift GW, Reed JP, Alldredge PA, Wylie TN, Walker JR, Watson MA, Heath SE, Shannon WD, Varghese N, Nagarajan R, Payton JE, Baty JD, Kulkarni S, Klco JM, Tomasson MH, Westervelt P, Walter MJ, Graubert TA, DiPersio JF, Ding L, Mardis ER, Wilson RK (2012) The origin and evolution of mutations in acute myeloid leukemia. Cell 150:264–78.
19. Jaiswal S, Fontanillas P, Flannick J, Manning A, Grauman PV, Mar BG, Lindsley RC, Mermel CH, Burtt N, Chavez A, Higgins JM, Moltchanov V, Kuo FC, Kluk MJ, Henderson B, Kinnunen L, Koistinen HA, Ladenvall C, Getz G, Correa A, Banahan BF, Gabriel S, Kathiresan S, Stringham HM, McCarthy MI, Boehnke M, Tuomilehto J, Haiman C, Groop L, Atzmon G, Wilson JG, Neuberg D, Altshuler D, Ebert BL (2014) Age-related clonal hematopoiesis associated with adverse outcomes. N Engl J Med 371:2488–98.
20. National Cancer Registration and Analysis Service, Northern Ireland Cancer Registry, Scottish Cancer Registry, Welsh Cancer Intelligence and Surveillance Unit (2021) Children, teenagers and young adults UK cancer statistics report 2021. http://www.ncin.org.uk/cancer_type_and_topic_specific_work/cancer_type_specific_work/cancer_in_children_teenagers_and_young_adults/
21. Aibar S, González-Blas CB, Moerman T, Huynh-Thu VA, Imrichova H, Hulselmans G, Rambow F, Marine J-C, Geurts P, Aerts J, van den Oord J, Atak ZK, Wouters J, Aerts S (2017) SCENIC: single-cell regulatory network inference and clustering. Nature Methods 14:1083–6.
22. Van de Sande B, Flerin C, Davie K, De Waegeneer M, Hulselmans G, Aibar S, Seurinck R, Saelens W, Cannoodt R, Rouchon Q, Verbeiren T, De Maeyer D, Reumers J, Saeys Y, Aerts S (2020) A scalable SCENIC workflow for single-cell gene regulatory network analysis. Nature Protocols 15:2247–76.
23. Burd A, Levine RL, Ruppert AS, Mims AS, Borate U, Stein EM, Patel P, Baer MR, Stock W, Deininger M, Blum W, Schiller G, Olin R, Litzow M, Foran J, Lin TL, Ball B, Boyiadzis M, Traer E, Odenike O, Arellano M, Walker A, Duong VH, Kovacsovics T, Collins R, Shoben AB, Heerema NA, Foster MC, Vergilio J-A, Brennan T, Vietz C, Severson E, Miller M, Rosenberg L, Marcus S, Yocum A, Chen T, Stefanos M, Druker B, Byrd JC (2020) Precision medicine treatment in acute myeloid leukemia using prospective genomic profiling: feasibility and preliminary efficacy of the Beat AML Master Trial. Nature Medicine 26:1852–8.
24. Lambo S, Trinh DL, Ries RE, Jin D, Setiadi A, Ng M, Leblanc VG, Loken MR, Brodersen LE, Dai F, Pardo LM, Ma X, Vercauteren SM, Meshinchi S, Marra MA (2023) A longitudinal single-cell atlas of treatment response in pediatric AML. Cancer Cell 41:2117–2135.
25. Bravo González-Blas C, De Winter S, Hulselmans G, Hecker N, Matetovici I, Christiaens V, Poovathingal S, Wouters J, Aibar S, Aerts S (2023) SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nature Methods 20:1355–67.
26. Zheng GXY, Terry JM, Belgrader P, Ryvkin P, Bent ZW, Wilson R, Ziraldo SB, Wheeler TD, McDermott GP, Zhu J, Gregory MT, Shuga J, Montesclaros L, Underwood JG, Masquelier DA, Nishimura SY, Schnall-Levin M, Wyatt PW, Hindson CM, Bharadwaj R, Wong A, Ness KD, Beppu LW, Deeg HJ, McFarland C, Loeb KR, Valente WJ, Ericson NG, Stevens EA, Radich JP, Mikkelsen TS, Hindson BJ, Bielas JH (2017) Massively parallel digital transcriptional profiling of single cells. Nature Communications 8:14049.
27. Petti AA, Williams SR, Miller CA, Fiddes IT, Srivatsan SN, Chen DY, Fronick CC, Fulton RS, Church DM, Ley TJ (2019) A general approach for detecting expressed mutations in AML cells using single cell RNA-sequencing. Nat Commun 10:3660.
28. Jiang L, Li XP, Dai YT, Chen B, Weng XQ, Xiong SM, Zhang M, Huang JY, Chen Z, Chen SJ (2020) Multidimensional study of the heterogeneity of leukemia cells in t(8;21) acute myelogenous leukemia identifies the subtype with poor outcome. Proc Natl Acad Sci U S A 117:20117–26.
29. Johnston G, Ramsey HE, Liu Q, Wang J, Stengel KR, Sampathi S, Acharya P, Arrate M, Stubbs MC, Burn T, Savona MR, Hiebert SW (2020) Nascent transcript and single-cell RNA-seq analysis defines the mechanism of action of the LSD1 inhibitor INCB059872 in myeloid leukemia. Gene 752:144758.
30. Pei S, Pollyea DA, Gustafson A, Stevens BM, Minhajuddin M, Fu R, Riemondy KA, Gillen AE, Sheridan RM, Kim J, Costello JC, Amaya ML, Inguva A, Winters A, Ye H, Krug A, Jones CL, Adane B, Khan N, Ponder J, Schowinsky J, Abbott D, Hammes A, Myers JR, Ashton JM, Nemkov T, D’Alessandro A, Gutman JA, Ramsey HE, Savona MR, Smith CA, Jordan CT (2020) Monocytic Subclones Confer Resistance to Venetoclax-Based Therapy in Patients with Acute Myeloid Leukemia. Cancer Discov 10:536–51.
31. Li K, Du Y, Cai Y, Liu W, Lv Y, Huang B, Zhang L, Wang Z, Liu P, Sun Q, Li N, Zhu M, Bosco B, Li L, Wu W, Wu L, Li J, Wang Q, Hong M, Qian S (2023) Single-cell analysis reveals the chemotherapy-induced cellular reprogramming and novel therapeutic targets in relapsed/refractory acute myeloid leukemia. Leukemia 37:308–25.
32. Lasry A, Nadorp B, Fornerod M, Nicolet D, Wu H, Walker CJ, Sun Z, Witkowski MT, Tikhonova AN, Guillamot-Ruano M, Cayanan G, Yeaton A, Robbins G, Obeng EA, Tsirigos A, Stone RM, Byrd JC, Pounds S, Carroll WL, Gruber TA, Eisfeld A-K, Aifantis I (2023) An inflammatory state remodels the immune microenvironment and improves risk stratification in acute myeloid leukemia. Nature Cancer 4:27–42.
33. Fiskus W, Mill CP, Birdwell C, Davis JA, Das K, Boettcher S, Kadia TM, DiNardo CD, Takahashi K, Loghavi S, Soth MJ, Heffernan T, McGeehan GM, Ruan X, Su X, Vakoc CR, Daver N, Bhalla KN (2023) Targeting of epigenetic co-dependencies enhances anti-AML efficacy of Menin inhibitor in AML with MLL1-r or mutant NPM1. Blood Cancer Journal 13:53.
34. Naldini MM, Casirati G, Barcella M, Rancoita PMV, Cosentino A, Caserta C, Pavesi F, Zonari E, Desantis G, Gilioli D, Carrabba MG, Vago L, Bernardi M, Di Micco R, Di Serio C, Merelli I, Volpin M, Montini E, Ciceri F, Gentner B (2023) Longitudinal single-cell profiling of chemotherapy response in acute myeloid leukemia. Nature Communications 14:1285.
35. Mumme H, Thomas BE, Bhasin SS, Krishnan U, Dwivedi B, Perumalla P, Sarkar D, Ulukaya GB, Sabnis HS, Park SI, DeRyckere D, Raikar SS, Pauly M, Summers RJ, Castellino SM, Wechsler DS, Porter CC, Graham DK, Bhasin M (2023) Single-cell analysis reveals altered tumor microenvironments of relapse- and remission-associated pediatric acute myeloid leukemia. Nature Communications 14:6209.
36. Zhang Y, Jiang S, He F, Tian Y, Hu H, Gao L, Zhang L, Chen A, Hu Y, Fan L, Yang C, Zhou B, Liu D, Zhou Z, Su Y, Qin L, Wang Y, He H, Lu J, Xiao P, Hu S, Wang Q-F (2023) Single-cell transcriptomics reveals multiple chemoresistant properties in leukemic stem and progenitor cells in pediatric AML. Genome Biology 24:199.
37. Li B, Kowalczyk MS, Slyper M, Jellert G, Tabaka M, Ashenberg O, Waldman J, Dionne D, Abigail K, Hui M, Yang Y, Rozenblatt-Rosen O, Regev A (2022) A single cell immune cell atlas of human hematopoietic system. Human Cell Atlas Data Portal: Human Cell Atlas. https://explore.data.humancellatlas.org/projects/cc95ff89-2e68-4a08-a234-480eca21ce79
38. Oetjen KA, Lindblad KE, Goswami M, Gui G, Dagur PK, Lai C, Dillon LW, McCoy JP, Hourigan CS (2018) Human bone marrow assessment by single-cell RNA sequencing, mass cytometry, and flow cytometry. JCI Insight 3.
39. Setty M, Kiseliovas V, Levine J, Gayoso A, Mazutis L, Pe’er D (2019) Characterization of cell fate probabilities in single-cell data with Palantir. Nat Biotechnol 37:451–60.
40. Caron M, St-Onge P, Sontag T, Wang YC, Richer C, Ragoussis I, Sinnett D, Bourque G (2020) Single-cell analysis of childhood leukemia reveals a link between developmental states and ribosomal protein expression as a source of intra-individual heterogeneity. Scientific Reports 10:8079.
41. Döhner H, Wei AH, Appelbaum FR, Craddock C, DiNardo CD, Dombret H, Ebert BL, Fenaux P, Godley LA, Hasserjian RP, Larson RA, Levine RL, Miyazaki Y, Niederwieser D, Ossenkoppele G, Röllig C, Sierra J, Stein EM, Tallman MS, Tien H-F, Wang J, Wierzbowska A, Löwenberg B (2022) Diagnosis and management of AML in adults: 2022 recommendations from an international expert panel on behalf of the ELN. Blood 140:1345–77.
42.
1. Montefiori LE
2. Bendig S
3. Gu Z
4. Chen X
5. Pölönen P
6. Ma X
7. Murison A
8. Zeng A
9. Garcia-Prat L
10. Dickerson K
11. Iacobucci I
12. Abdelhamed S
13. Hiltenbrand R
14. Mead PE
15. Mehr CM
16. Xu B
17. Cheng Z
18. Chang TC
19. Westover T
20. Ma J
21. Stengel A
22. Kimura S
23. Qu C
24. Valentine MB
25. Rashkovan M
26. Luger S
27. Litzow MR
28. Rowe JM
29. den Boer ML
30. Wang V
31. Yin J
32. Kornblau SM
33. Hunger SP
34. Loh ML
35. Pui CH
36. Yang W
37. Crews KR
38. Roberts KG
39. Yang JJ
40. Relling MV
41. Evans WE
42. Stock W
43. Paietta EM
44. Ferrando AA
45. Zhang J
46. Kern W
47. Haferlach T
48. Wu G
49. Dick JE
50. Klco JM
51. Haferlach C
52. Mullighan CG
2021. Enhancer Hijacking Drives Oncogenic BCL11B Expression in Lineage-Ambiguous Stem Cell Leukemia. Cancer Discov 11:2846–67.
43.
1. Elsayed AH
2. Rafiee R
3. Cao X
4. Raimondi S
5. Downing JR
6. Ribeiro R
7. Fan Y
8. Gruber TA
9. Baker S
10. Klco J
11. Rubnitz JE
12. Pounds S
13. Lamba JK
2020. A six-gene leukemic stem cell score identifies high risk pediatric acute myeloid leukemia. Leukemia 34:735–45.
44.
1. Ng SWK
2. Mitchell A
3. Kennedy JA
4. Chen WC
5. McLeod J
6. Ibrahimova N
7. Arruda A
8. Popescu A
9. Gupta V
10. Schimmer AD
11. Schuh AC
12. Yee KW
13. Bullinger L
14. Herold T
15. Görlich D
16. Büchner T
17. Hiddemann W
18. Berdel WE
19. Wörmann B
20. Cheok M
21. Preudhomme C
22. Dombret H
23. Metzeler K
24. Buske C
25. Löwenberg B
26. Valk PJM
27. Zandstra PW
28. Minden MD
29. Dick JE
30. Wang JCY
2016. A 17-gene stemness score for rapid determination of risk in acute leukaemia. Nature 540:433–7.
45.
1. Hamed AA
2. Kunz DJ
3. El-Hamamy I
4. Trinh QM
5. Subedar OD
6. Richards LM
7. Foltz W
8. Bullivant G
9. Ware M
10. Vladoiu MC
11. Zhang J
12. Raj AM
13. Pugh TJ
14. Taylor MD
15. Teichmann SA
16. Stein LD
17. Simons BD
18. Dirks PB
2022. A brain precursor atlas reveals the acquisition of developmental-like states in adult cerebral tumours. Nat Commun 13:4178.
46.
1. Barnett SN
2. Cujba AM
3. Yang L
4. Maceiras AR
5. Li S
6. Kedlian VR
7. Pett JP
8. Polanski K
9. Miranda AMA
10. Xu C
11. Cranley J
12. Kanemaru K
13. Lee M
14. Mach L
15. Perera S
16. Tudor C
17. Joseph PD
18. Pritchard S
19. Toscano-Rivalta R
20. Tuong ZK
21. Bolt L
22. Petryszak R
23. Prete M
24. Cakir B
25. Huseynov A
26. Sarropoulos I
27. Chowdhury RA
28. Elmentaite R
29. Madissoon E
30. Oliver AJ
31. Campos L
32. Brazovskaja A
33. Gomes T
34. Treutlein B
35. Kim CN
36. Nowakowski TJ
37. Meyer KB
38. Randi AM
39. Noseda M
40. Teichmann SA
2024. An organotypic atlas of human vascular cells. Nat Med 30:3468–3481.
47.
1. Zhang B
2. He P
3. Lawrence JEG
4. Wang S
5. Tuck E
6. Williams BA
7. Roberts K
8. Kleshchevnikov V
9. Mamanova L
10. Bolt L
11. Polanski K
12. Li T
13. Elmentaite R
14. Fasouli ES
15. Prete M
16. He X
17. Yayon N
18. Fu Y
19. Yang H
20. Liang C
21. Zhang H
22. Blain R
23. Chedotal A
24. FitzPatrick DR
25. Firth H
26. Dean A
27. Bayraktar OA
28. Marioni JC
29. Barker RA
30. Storer MA
31. Wold BJ
32. Zhang H
33. Teichmann SA
2024. A human embryonic limb cell atlas resolved in space and time. Nature 635:668–678.
48.
1. Nguyen H
2. Tran D
3. Tran B
4. Pehlivan B
5. Nguyen T.
2021. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 22:bbaa190.
49.
1. Lu Z
2. Hong CC
3. Kong G
4. Assumpção A
5. Ong IM
6. Bresnick EH
7. Zhang J
8. Pan X.
2018. Polycomb Group Protein YY1 Is an Essential Regulator of Hematopoietic Stem Cell Quiescence. Cell Rep 22:1545–59.
50.
1. Fisher JB
2. Peterson J
3. Reimer M
4. Stelloh C
5. Pulakanti K
6. Gerbec ZJ
7. Abel AM
8. Strouse JM
9. Strouse C
10. McNulty M
11. Malarkannan S
12. Crispino JD
13. Milanovich S
14. Rao S.
2017. The cohesin subunit Rad21 is a negative regulator of hematopoietic self-renewal through epigenetic repression of Hoxa7 and Hoxa9. Leukemia 31:712–9.
51.
1. Kumar P
2. Zhang N
3. Lee J
4. Cheng H
5. Kurtz K
6. Conneely SE
7. Sasidharan R
8. Rau RE
9. Pati D.
2023. Cohesin Subunit RAD21 Regulates the Differentiation and Self-Renewal of Hematopoietic Stem and Progenitor Cells. Stem Cells 41:971–85.
52.
1. Ning S
2. Pagano JS
3. Barber GN
2011. IRF7: activation, regulation, modification and function. Genes & Immunity 12:399–414.
53.
1. Fischer J
2. Walter C
3. Tönges A
4. Aleth H
5. Jordão MJC
6. Leddin M
7. Gröning V
8. Erdmann T
9. Lenz G
10. Roth J
11. Vogl T
12. Prinz M
13. Dugas M
14. Jacobsen ID
15. Rosenbauer F.
2019. Safeguard function of PU.1 shapes the inflammatory epigenome of neutrophils. Nature Immunology 20:546–58.
54.
1. Fang W-F
2. Chen Y-M
3. Lin C-Y
4. Huang H-L
5. Yeh H
6. Chang Y-T
7. Huang K-T
8. Lin M-C.
2018. Histone deacetylase 2 (HDAC2) attenuates lipopolysaccharide (LPS)-induced inflammation by regulating PAI-1 expression. Journal of Inflammation 15:3.
55.
1. Ptasinska A
2. Pickin A
3. Assi SA
4. Chin PS
5. Ames L
6. Avellino R
7. Gröschel S
8. Delwel R
9. Cockerill PN
10. Osborne CS
11. Bonifer C.
2019. RUNX1-ETO Depletion in t(8;21) AML Leads to C/EBPα- and AP-1-Mediated Alterations in Enhancer-Promoter Interaction. Cell Reports 28:3022–31.
56.
1. Martinez-Soria N
2. McKenzie L
3. Draper J
4. Ptasinska A
5. Issa H
6. Potluri S
7. Blair HJ
8. Pickin A
9. Isa A
10. Chin PS
11. Tirtakusuma R
12. Coleman D
13. Nakjang S
14. Assi S
15. Forster V
16. Reza M
17. Law E
18. Berry P
19. Mueller D
20. Osborne C
21. Elder A
22. Bomken SN
23. Pal D
24. Allan JM
25. Veal GJ
26. Cockerill PN
27. Wichmann C
28. Vormoor J
29. Lacaud G
30. Bonifer C
31. Heidenreich O.
2018. The Oncogenic Transcription Factor RUNX1/ETO Corrupts Cell Cycle Regulation to Drive Leukemic Transformation. Cancer Cell 34:626–42.
57.
1. Patrick R
2. Naval-Sanchez M
3. Deshpande N
4. Huang Y
5. Zhang J
6. Chen X
7. Yang Y
8. Tiwari K
9. Esmaeili M
10. Tran M
11. Mohamed AR
12. Wang B
13. Xia D
14. Ma J
15. Bayliss J
16. Wong K
17. Hun ML
18. Sun X
19. Cao B
20. Cottle DL
21. Catterall T
22. Barzilai-Tutsch H
23. Troskie RL
24. Chen Z
25. Wise AF
26. Saini S
27. Soe YM
28. Kumari S
29. Sweet MJ
30. Thomas HE
31. Smyth IM
32. Fletcher AL
33. Knoblich K
34. Watt MJ
35. Alhomrani M
36. Alsanie W
37. Quinn KM
38. Merson TD
39. Chidgey AP
40. Ricardo SD
41. Yu D
42. Jardé T
43. Cheetham SW
44. Marcelle C
45. Nilsson SK
46. Nguyen Q
47. White MD
48. Nefzger CM
2024. The activity of early-life gene regulatory elements is hijacked in aging through pervasive AP-1-linked chromatin opening. Cell Metab 36:1858–81.
58.
1. Robinson MD
2. McCarthy DJ
3. Smyth GK
2010. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26:139–40.
59.
1. Love MI
2. Huber W
3. Anders S.
2014. Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2. Genome Biology 15:550.
60.
1. Holmfeldt P
2. Ganuza M
3. Marathe H
4. He B
5. Hall T
6. Kang G
7. Moen J
8. Pardieck J
9. Saulsberry AC
10. Cico A
11. Gaut L
12. McGoldrick D
13. Finkelstein D
14. Tan K
15. McKinney-Freeman S.
2016. Functional screen identifies regulators of murine hematopoietic stem cell repopulation. J Exp Med 213:433–49.
61.
1. Chen C
2. Yu W
3. Tober J
4. Gao P
5. He B
6. Lee K
7. Trieu T
8. Blobel GA
9. Speck NA
10. Tan K.
2019. Spatial Genome Reorganization between Fetal and Adult Hematopoietic Stem Cells. Cell Rep 29:4200–11.
62.
1. Unnikrishnan A
2. Papaemmanuil E
3. Beck D
4. Deshpande NP
5. Verma A
6. Kumari A
7. Woll PS
8. Richards LA
9. Knezevic K
10. Chandrakanthan V
11. Thoms JAI
12. Tursky ML
13. Huang Y
14. Ali Z
15. Olivier J
16. Galbraith S
17. Kulasekararaj AG
18. Tobiasson M
19. Karimi M
20. Pellagatti A
21. Wilson SR
22. Lindeman R
23. Young B
24. Ramakrishna R
25. Arthur C
26. Stark R
27. Crispin P
28. Curnow J
29. Warburton P
30. Roncolato F
31. Boultwood J
32. Lynch K
33. Jacobsen SEW
34. Mufti GJ
35. Hellstrom-Lindberg E
36. Wilkins MR
37. MacKenzie KL
38. Wong JWH
39. Campbell PJ
40. Pimanda JE
2017. Integrative Genomics Identifies the Molecular Basis of Resistance to Azacitidine Therapy in Myelodysplastic Syndromes. Cell Reports 20:572–85.
63.
1. Williams MS
2. Amaral FMR
3. Simeoni F
4. Somervaille TCP
2020. A stress-responsive enhancer induces dynamic drug resistance in acute myeloid leukemia. The Journal of Clinical Investigation 130:1217–32.
64.
1. Xu H
2. Muise ES
3. Javaid S
4. Chen L
5. Cristescu R
6. Mansueto MS
7. Follmer N
8. Cho J
9. Kerr K
10. Altura R
11. Machacek M
12. Nicholson B
13. Addona G
14. Kariv I
15. Chen H.
2019. Identification of predictive genetic signatures of Cytarabine responsiveness using a 3D acute myeloid leukaemia model. Journal of Cellular and Molecular Medicine 23:7063–77.
65.
1. Zhang H
2. Nakauchi Y
3. Köhnke T
4. Stafford M
5. Bottomly D
6. Thomas R
7. Wilmot B
8. McWeeney SK
9. Majeti R
10. Tyner JW
2020. Integrated analysis of patient samples identifies biomarkers for venetoclax efficacy and combination strategies in acute myeloid leukemia. Nature Cancer 1:826–39.
66.
1. Aliee H
2. Theis FJ
2021. AutoGeneS: Automatic gene selection using multi-objective optimization for RNA-seq deconvolution. Cell Syst 12:706–15.
67.
1. Eferl R
2. Wagner EF
2003. AP-1: a double-edged sword in tumorigenesis. Nature Reviews Cancer 3:859–68.
68.
1. Wang L
2. Gural A
3. Sun X-J
4. Zhao X
5. Perna F
6. Huang G
7. Hatlen MA
8. Vu L
9. Liu F
10. Xu H
11. Asai T
12. Xu H
13. Deblasio T
14. Menendez S
15. Voza F
16. Jiang Y
17. Cole PA
18. Zhang J
19. Melnick A
20. Roeder RG
21. Nimer SD
2011. The Leukemogenicity of AML1-ETO Is Dependent on Site-Specific Lysine Acetylation. Science 333:765–9.
69.
1. Crowley SJ
2. Bednarski JJ
3. Magee JA
4. Li Y
5. White LS
6. Yang W.
2022. BCLAF1 Regulates Expression of AP-1 Genes and Fetal Hematopoietic Stem Cell Repopulation Activity. Blood 140:2852–3.
70.
1. Dell’Aversana C
2. Giorgio C
3. D’Amato L
4. Lania G
5. Matarese F
6. Saeed S
7. Di Costanzo A
8. Belsito Petrizzi V
9. Ingenito C
10. Martens JHA
11. Pallavicini I
12. Minucci S
13. Carissimo A
14. Stunnenberg HG
15. Altucci L.
2017. miR-194-5p/BCLAF1 deregulation in AML tumorigenesis. Leukemia 31:2315–25.
71.
1. Tsherniak A
2. Vazquez F
3. Montgomery PG
4. Weir BA
5. Kryukov G
6. Cowley GS
7. Gill S
8. Harrington WF
9. Pantel S
10. Krill-Burger JM
11. Meyers RM
12. Ali L
13. Goodale A
14. Lee Y
15. Jiang G
16. Hsiao J
17. Gerath WFJ
18. Howell S
19. Merkel E
20. Ghandi M
21. Garraway LA
22. Root DE
23. Golub TR
24. Boehm JS
25. Hahn WC
2017. Defining a Cancer Dependency Map. Cell 170:564–76.
72.
1. Asou H
2. Tashiro S
3. Hamamoto K
4. Otsuji A
5. Kita K
6. Kamada N.
1991. Establishment of a Human Acute Myeloid Leukemia Cell Line (Kasumi-1) With 8;21 Chromosome Translocation. Blood 77:2031–6.
73.
1. Matozaki S
2. Nakagawa T
3. Kawaguchi R
4. Aozaki R
5. Tsutsumi M
6. Murayama T
7. Koizumi T
8. Nishimura R
9. Isobe T
10. Chihara K.
1995. Establishment of a myeloid leukaemic cell line (SKNO-1) from a patient with t(8;21) who acquired monosomy 17 during disease progression. Br J Haematol 89:805–11.
74.
1. Körber V
2. Stainczyk SA
3. Kurilov R
4. Henrich KO
5. Hero B
6. Brors B
7. Westermann F
8. Höfer T.
2023. Neuroblastoma arises in early fetal development and its evolutionary duration predicts outcome. Nat Genet 55:619–630.
75.
1. Cabezas-Wallscheid N
2. Eichwald V
3. de Graaf J
4. Löwer M
5. Lehr HA
6. Kreft A
7. Eshkind L
8. Hildebrandt A
9. Abassi Y
10. Heck R
11. Dehof AK
12. Ohngemach S
13. Sprengel R
14. Wörtge S
15. Schmitt S
16. Lotz J
17. Meyer C
18. Kindler T
19. Zhang DE
20. Kaina B
21. Castle JC
22. Trumpp A
23. Sahin U
24. Bockamp E.
2013. Instruction of haematopoietic lineage choices, evolution of transcriptional landscapes and cancer stem cell hierarchies derived from an AML1-ETO mouse model. EMBO Mol Med 5:1804–20.
76.
1. Singh SK
2. Kagalwala MN
3. Parker-Thornburg J
4. Adams H
5. Majumder S.
2008. REST maintains self-renewal and pluripotency of embryonic stem cells. Nature 453:223–7.
77.
1. Dahl JA
2. Jung I
3. Aanes H
4. Greggains GD
5. Manaf A
6. Lerdrup M
7. Li G
8. Kuan S
9. Li B
10. Lee AY
11. Preissl S
12. Jermstad I
13. Haugen MH
14. Suganthan R
15. Bjørås M
16. Hansen K
17. Dalen KT
18. Fedorcsak P
19. Ren B
20. Klungland A.
2016. Broad histone H3K4me3 domains in mouse oocytes modulate maternal-to-zygotic transition. Nature 537:548–52.
78.
1. Nicosia L
2. Spencer GJ
3. Brooks N
4. Amaral FMR
5. Basma NJ
6. Chadwick JA
7. Revell B
8. Wingelhofer B
9. Maiques-Diaz A
10. Sinclair O
11. Camera F
12. Ciceri F
13. Wiseman DH
14. Pegg N
15. West W
16. Knurowski T
17. Frese K
18. Clegg K
19. Campbell VL
20. Cavet J
21. Copland M
22. Searle E
23. Somervaille TCP
2023. Therapeutic targeting of EP300/CBP by bromodomain inhibition in hematologic malignancies. Cancer Cell 41:2136–53.
79.
1. Crowley S
2. White LS
3. Li Y
4. Yang W
5. Magee JA
6. Bednarski JJ
2022. Bclaf1 promotes hematopoietic stem cell repopulating capacity and self-renewal. The Journal of Immunology 208.
80.
1. Wolf FA
2. Angerer P
3. Theis FJ
2018. SCANPY: large-scale single-cell gene expression data analysis. Genome Biology 19:15.
81.
1. Wolock SL
2. Lopez R
3. Klein AM
2019. Scrublet: Computational Identification of Cell Doublets in Single-Cell Transcriptomic Data. Cell Systems 8:281–91.
82.
1. Heumos L
2. Schaar A
3. Consortium S-CBP
2023. Single-cell best practices. https://www.sc-best-practices.org/preamble.html#
83.
1. Büttner M
2. Miao Z
3. Wolf FA
4. Teichmann SA
5. Theis FJ
2019. A test metric for assessing single-cell RNA-seq batch correction. Nature Methods 16:43–9.
84.
1. Domínguez Conde C
2. Xu C
3. Jarvis LB
4. Rainbow DB
5. Wells SB
6. Gomes T
7. Howlett SK
8. Suchanek O
9. Polanski K
10. King HW
11. Mamanova L
12. Huang N
13. Szabo PA
14. Richardson L
15. Bolt L
16. Fasouli ES
17. Mahbubani KT
18. Prete M
19. Tuck L
20. Richoz N
21. Tuong ZK
22. Campos L
23. Mousa HS
24. Needham EJ
25. Pritchard S
26. Li T
27. Elmentaite R
28. Park J
29. Rahmani E
30. Chen D
31. Menon DK
32. Bayraktar OA
33. James LK
34. Meyer KB
35. Yosef N
36. Clatworthy MR
37. Sims PA
38. Farber DL
39. Saeb-Parsy K
40. Jones JL
41. Teichmann SA
2022. Cross-tissue immune cell analysis reveals tissue-specific features in humans. Science 376:eabl5197.
85.
1. Aran D
2. Looney AP
3. Liu L
4. Wu E
5. Fong V
6. Hsu A
7. Chak S
8. Naikawadi RP
9. Wolters PJ
10. Abate AR
11. Butte AJ
12. Bhattacharya M.
2019. Reference-based analysis of lung single-cell sequencing reveals a transitional profibrotic macrophage. Nat Immunol 20:163–72.
86.
1. Novershtern N
2. Subramanian A
3. Lawton LN
4. Mak RH
5. Haining WN
6. McConkey ME
7. Habib N
8. Yosef N
9. Chang CY
10. Shay T
11. Frampton GM
12. Drake AC
13. Leskov I
14. Nilsson B
15. Preffer F
16. Dombkowski D
17. Evans JW
18. Liefeld T
19. Smutko JS
20. Chen J
21. Friedman N
22. Young RA
23. Golub TR
24. Regev A
25. Ebert BL
2011. Densely interconnected transcriptional circuits control cell states in human hematopoiesis. Cell 144:296–309.
87.
1. Ianevski A
2. Giri AK
3. Aittokallio T.
2022. Fully-automated and ultra-fast cell-type identification using specific marker combinations from single-cell transcriptomic data. Nature Communications 13:1246.
88.
1. Wu T
2. Hu E
3. Xu S
4. Chen M
5. Guo P
6. Dai Z
7. Feng T
8. Zhou L
9. Tang W
10. Zhan L
11. Fu X
12. Liu S
13. Bo X
14. Yu G.
2021. clusterProfiler 4.0: A universal enrichment tool for interpreting omics data. The Innovation 2.

Article and author information

Author information

Jessica Whittle
Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom; Stem Cell Biology Group, Cancer Research UK Manchester Institute, The University of Manchester, Manchester, United Kingdom
Stefan Meyer
Manchester Cancer Research Centre (MCRC), Division of Cancer Sciences, School of Medical Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom; Department of Paediatric and Adolescent Oncology, Royal Manchester Children’s Hospital, Manchester, United Kingdom; Department of Adolescent Oncology, The Christie NHS Foundation Trust, Manchester, United Kingdom
Georges Lacaud
Stem Cell Biology Group, Cancer Research UK Manchester Institute, The University of Manchester, Manchester, United Kingdom
ORCID iD: 0000-0002-5630-2417
- For correspondence: georges.lacaud@cruk.manchester.ac.uk
Syed Murtuza Baker
Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
- For correspondence: syed.murtuzabaker@manchester.ac.uk
Mudassar Iqbal
Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, United Kingdom
ORCID iD: 0000-0002-5006-4331
- For correspondence: mudassar.iqbal@manchester.ac.uk

Author Notes

Competing interests: No competing interests declared

Version history

Sent for peer review: January 15, 2025
Preprint posted: January 17, 2025
Reviewed Preprint version 1: May 15, 2025
Reviewed Preprint version 2: November 12, 2025
Version of Record published: February 11, 2026

Cite all versions

You can cite all versions using the DOI https://doi.org/10.7554/eLife.104978. This DOI represents all versions, and will always resolve to the latest one.

Copyright

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

