1. Cancer Biology
Download icon

Single-cell chromatin accessibility profiling of glioblastoma identifies an invasive cancer stem cell population associated with lower survival

  1. Paul Guilhamon
  2. Charles Chesnelong
  3. Michelle M Kushida
  4. Ana Nikolic
  5. Divya Singhal
  6. Graham MacLeod
  7. Seyed Ali Madani Tonekaboni
  8. Florence MG Cavalli
  9. Christopher Arlidge
  10. Nishani Rajakulendran
  11. Naghmeh Rastegar
  12. Xiaoguang Hao
  13. Rozina Hassam
  14. Laura J Smith
  15. Heather Whetstone
  16. Fiona J Coutinho
  17. Bettina Nadorp
  18. Katrina I Ellestad
  19. H Artee Luchman
  20. Jennifer Ai-wen Chan
  21. Molly S Shoichet
  22. Michael D Taylor
  23. Benjamin Haibe-Kains
  24. Samuel Weiss
  25. Stephane Angers
  26. Marco Gallo
  27. Peter B Dirks  Is a corresponding author
  28. Mathieu Lupien  Is a corresponding author
  1. Princess Margaret Cancer Centre, University Health Network, Canada
  2. Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Canada
  3. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Canada
  4. Alberta Children’s Hospital Research Institute, University of Calgary, Canada
  5. Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Canada
  6. Leslie Dan Faculty of Pharmacy, University of Toronto, Canada
  7. Department of Medical Biophysics, University of Toronto, Canada
  8. Hotchkiss Brain Institute, University of Calgary, Canada
  9. Department of Cell Biology & Anatomy, University of Calgary, Canada
  10. Institute of Biomaterials and Biomedical Engineering, University of Toronto, Canada
  11. Department of Pathology and Laboratory Medicine, University of Calgary, Canada
  12. Division of Neurosurgery, University of Toronto, Canada
  13. Departments of Molecular Genetics and Surgery, University of Toronto, Canada
  14. Department of Computer Science, University of Toronto, Canada
  15. Ontario Institute for Cancer Research, Canada
  16. Vector Institute, Canada
  17. Department of Physiology & Pharmacology, University of Calgary, Canada
  18. Department of Biochemistry, Faculty of Medicine, University of Toronto, Canada
Research Article
  • Cited 1
  • Views 2,500
  • Annotations
Cite this article as: eLife 2021;10:e64090 doi: 10.7554/eLife.64090

Abstract

Chromatin accessibility discriminates stem from mature cell populations, enabling the identification of primitive stem-like cells in primary tumors, such as glioblastoma (GBM) where self-renewing cells driving cancer progression and recurrence are prime targets for therapeutic intervention. We show, using single-cell chromatin accessibility, that primary human GBMs harbor a heterogeneous self-renewing population whose diversity is captured in patient-derived glioblastoma stem cells (GSCs). In-depth characterization of chromatin accessibility in GSCs identifies three GSC states: Reactive, Constructive, and Invasive, each governed by uniquely essential transcription factors and present within GBMs in varying proportions. Orthotopic xenografts reveal that GSC states associate with survival, and identify an invasive GSC signature predictive of low patient survival, in line with the higher invasive properties of Invasive state GSCs compared to Reactive and Constructive GSCs as shown by in vitro and in vivo assays. Our chromatin-driven characterization of GSC states improves prognostic precision and identifies dependencies to guide combination therapies.

Introduction

Glioblastoma (GBM) is a lethal form of brain cancer with standard surgery and radiation giving a median survival of only 12.6 months (Carlsson et al., 2014). The addition of temozolomide chemotherapy provides only an additional 2.5 months in the small subset of responsive patients (Stupp et al., 2005). Despite extensive characterization and stratification of the bulk primary tumors, no targeted therapies have been successfully developed (Carlsson et al., 2014; von Neubeck et al., 2015). GBM tumors are rooted in self-renewing tumor-initiating cells commonly referred to as glioblastoma stem cells (GSCs) (Venere et al., 2011) that drive disease progression in vivo (Chen et al., 2012; Gallo et al., 2015) and display resistance to chemo- and radiotherapy leading to disease recurrence (Bao et al., 2006). The promise of therapeutically targeting self-renewing tumor-initiating cancer cells depends on our capacity to capture the full range of heterogeneity within this population from individual tumors. Intratumoral heterogeneity within primary GBM has recently been documented through single-cell RNA-seq experiments and revealed a continuum between four cellular states (Neftel et al., 2019): neural-progenitor-like (NPC), oligodendrocyte-progenitor-like (OPC), astrocyte-like (AC), and mesenchymal-like (MES) (Neftel et al., 2019). A subsequent study (Wang et al., 2019) using single-cell gene-centric enrichment analysis placed GBM cells along a single axis of variation from proneural to mesenchymal transcriptional profiles, with cells expressing stem-associated genes lying at the extremes of this axis. Hence, primary GBM consists of distinct states, across which stem-like cells appear to be found. Whether these stem-like cells found across GBM states represent functionally distinct GSC populations with tumor-initiating properties and unique dependencies remains to be established to guide therapeutic progress. To address this issue, we combined single-cell technologies to define GSC composition in primary GBM with functional assays to reveal the unique dependencies across GSCs, reflective of invasive, constructive, and reactive states that relate to patient outcome.

Results

Chromatin accessibility readily discriminates stem from mature cell populations (Stergachis et al., 2013), which can be resolved at the single-cell level through single-cell ATAC-seq (Buenrostro et al., 2018; Corces et al., 2016), taking into account non-gene centric features, such as accessibility of noncoding elements and total amount of accessible DNA sequences. Applying single-cell chromatin accessibility profiling (scATAC-seq) across four primary adult GBM tumors (3797 cells), wild type for both IDH1 and IDH2, revealed seven to nine accessibility modules in each tumor based on unsupervised clustering (Figure 1A). We assigned cells to each of the four scRNA-seq-derived cellular states (Neftel et al., 2019) based on individual cells’ chromatin accessibility enrichment scores for the promoter regions of each state’s signature genes. Across the four tumors, 35–55.2% of the cells were significantly enriched for at least one state’s signature genes (Figure 1—figure supplement 1A). The MES state reported from scRNA-seq (Neftel et al., 2019) dominates the identity of two or more modules reported from chromatin accessibility in every tumor (Figure 1B,C, Figure 1—figure supplement 1B). In contrast, the NPC and OPC states are mixed within the same module defined based on chromatin accessibility, dominating over the other states typically in at least two modules. Cells assigned to the AC state did not preferentially cluster within a single module reported from chromatin accessibility (Figure 1B,C, Figure 1—figure supplement 1B). Collectively, our results suggest that chromatin accessibility reflects a greater stratification of the MES state and detects similarities between the OPC and NPC states and heterogeneity within the AC state.

Figure 1 with 1 supplement see all
The diverse glioblastoma (GBM) cancer stem cell pool.

(A) UMAP (Uniform Manifold Approximation and Projection) representation of chromatin accessibility across four primary GBM. (B) UMAPs with tumor cells assigned to cellular states. (C) UMAP modules are grouped by dominant cellular state. (D) UMAPs with cancer stem cells highlighted based on the enrichment of GBM cancer stem transcription factor promoters. (E) Distribution of cancer stem cells across the modules dominated by each cellular state.

To identify putative cancer stem cells within each primary tumor, we next focused on the level of chromatin accessibility at promoters of 19 transcription factors previously associated with self-renewal and tumor-propagating capacity in GBM (Suvà et al., 2014; Figure 1D). Individual cells scoring as putative cancer stem cells were not restricted to a unique module defined by chromatin accessibility but were distributed across a subset of modules, suggesting heterogeneity across cancer stem cells in primary GBM, in agreement with reports relying on single GBM cell labeling (Corces et al., 2016; Gallo et al., 2015; Lan et al., 2017; Liau et al., 2017; Meyer et al., 2015; Miller et al., 2017; Orzan et al., 2017; Park et al., 2017; Rheinbay et al., 2013; Stergachis et al., 2013; Suvà et al., 2014), assessing the heterogeneity of self-renewing tumor-initiating cells. Putative cancer stem cells identified in primary GBM through scATAC-seq were found in modules ascribed to every one of the four cellular states defined by gene expression (Neftel et al., 2019), predominantly within NPC and OPC containing modules and a smaller fraction (<10%) in MES-specific modules across all four tumors (Figure 1E). This suggests that the core transcriptional unit of cancer stem cells in primary GBM (Suvà et al., 2014) is not restricted to a unique population defined by its global transcriptional or chromatin accessibility profile with the resolution achieved with current single-cell technologies.

To further probe the heterogeneity in chromatin accessibility within the GBM cancer stem cell pool, we derived GSC populations from 27 adult IDH wild-type GBM tumors (Pollard et al., 2009) and profiled their chromatin accessibility by bulk ATAC-seq (Figure 2A). Each patient-derived GSC showed a similar enrichment for accessible chromatin regions in promoters and 5′UTRs, and depletion in introns and distal intergenic regions (Figure 2B). Collectively, we uncovered 92% of the total predicted regions of accessible chromatin (255,891 regions) within GSCs based on a saturation analysis using a self-starting non-linear regression model across the 27 samples (Figure 2C). We next assessed the similarity between these GSCs and the putative cancer stem cells found by scATAC-seq in the four primary GBMs. GSCs were identified within each tumor by calculating the enrichment of accessible chromatin regions shared by a majority of GSCs (>14/27) in each tumor cell (Figure 2D). On average, 11.3% of cells in each primary GBM were labeled as GSCs. Comparing the distribution of GSCs across the seven to nine modules defined by scATAC-seq to that of the 19 transcription factor-derived cancer stem cell signature demonstrates concordance between the two signatures (Figure 2E). Moreover, the enrichment z-scores for both cancer stem signatures (i.e. stem transcription factors signature and GSC chromatin accessibility signature) are significantly correlated across cells in all four tumors (p≤1.6×10−5) (Figure 2F). The overlap in cells significantly enriched for either stem signatures across the four tumors is also significant (hypergeometric test p-value=8.8×10−8) Additionally, an average of 91.2% (85.1–100%) of the cells identified by either signature display the hallmark GBM copy number changes at chromosomes 7 and 10, confirming their neoplastic status (Figure 2—figure supplement 1). Collectively, these results demonstrate that the patient-derived GSC populations reflect the chromatin identity of putative cancer stem cells found in primary brain tumors, highlighting the value of these GSCs as models to deepen our understanding of individual cells within primary GBM with features found in self-renewing tumor-initiating cells. Accordingly, spectral clustering of the 27 patient-derived GSC ATAC profiles identifies three distinct states of self-renewing tumor-initiating cells (Figure 3A). Expression profiling of these 27 GSCs by RNA-seq reveals GSCs significantly enriched for the signatures of each of the three TCGA GBM subtypes (proneural, classical, and mesenchymal) (Wang et al., 2017; Figure 3B). However, the assignment of the proneural, classical, and mesenchymal subtypes across GSCs did not match the three clusters identified from ATAC-seq (Figure 3B). Conversely, clustering the GSCs by gene expression, independently of their chromatin accessibility, did largely recapitulate the GSC states defined by chromatin accessibility (Figure 3B). This suggests that the mismatch between the GSC cluster from chromatin accessibility profiles and TCGA expression subtypes is not mainly due to differences between chromatin accessibility and gene expression. Potential alternative causes for this observed mismatch include the absence of a tumor microenvironment in the GSC populations or that given the TCGA subtypes were determined from bulk GBM, they may not fully capture the nature of rarer populations found within a tumor, such as the cancer stem cell populations.

Figure 2 with 1 supplement see all
GSCs recapitulate the glioblastoma (GBM) cancer stem cell population.

(A) Schematic representation of the GSC derivation process, from patient tumor to GSC-enriched population. (B) Genomic feature enrichment of accessible chromatin peaks. (C) Saturation curve for the 27 GSCs. (D) UMAPs with GSCs highlighted based on the enrichment of shared accessible regions across GSCs. (E) Proportion of UMAP modules assigned to cancer stem cells and GSCs. (F) Correlation of z-scores for each signature for each cell in each primary GBM.

Figure 3 with 1 supplement see all
Three glioblastoma stem cell (GSC) states driven by chromatin accessibility.

(A) Spectral clustering of ATAC-seq signal across peaks in 27 GSCs. (B) Spectral clustering of gene expression across GSCs and comparison to chromatin-derived GSC states. Enrichment of TCGA subtypes across GSCs and comparison to GSC states as determined by ATAC. The GSEA (Gene Set Enrichment Analysis) gene sets each contained 50 genes and the enrichment scores ranged from 0.16 to 0.73. (C) Gene set enrichment analysis in each GSC state. All displayed terms are significantly enriched (q-value < 0.05). (D) Copy number alterations (CNAs) across GSCs identified from DNA methylation array data cluster GSCs into five subgroups. (E) Number and percentage of peaks unique and shared in each GSC state. (F) Saturation analysis of each individual state.

Gene set enrichment analysis with GSEA (Gene Set Enrichment Analysis) (Subramanian et al., 2005) and g:profiler (Reimand et al., 2016) using genes exclusively enriched for both expression and promoter chromatin accessibility in each subtype reported significantly enriched terms defining the largest GSC state as a Reactive state, with terms related to immune cells and response (Figure 3C, top panel). A second GSC state was enriched for Constructive gene sets involved in brain, neuron, and glial cell development (Figure 3C, middle panel). The third and smallest GSC state presented an Invasive state characterized by terms relating to the extracellular matrix and angiogenesis (Figure 3C, bottom panel). We next mapped copy number alterations (CNAs) across the 27 GSCs by applying the molecular neuropathology classifier tool (Capper et al., 2018) to DNA methylation data from the GSCs (Figure 3D). While common GBM CNAs, including EGFR gains and CDKN2A/B loss, were observed, the CNA-based classification of GSCs failed to match the three chromatin accessibility-derived states, suggesting the three GSC states are not defined by somatic copy number events (Figure 3D). Further comparison of the accessible chromatin in each GSC state reveals that only a small subset of accessible chromatin regions drives the three GSC states (Figure 3E, Figure 3—figure supplement 1). Our ability to discriminate GSC state-specific regions of accessible chromatin is reflective of the comprehensiveness of our cohort to saturate the detection of accessible regions to 93%, 88%, and 71% across the Reactive (n = 13), Constructive (n = 9), and Invasive (n = 5) state GSCs, respectively (Figure 3F).

Considering that regions of accessible chromatin serve as binding sites for transcription factors engaging in gene expression regulation, we next tested for DNA recognition motif family enrichment across regions exclusively accessible in Reactive, Constructive, or Invasive GSC states (Figure 4A, Figure 4—figure supplement 1A). The most enriched DNA recognition motif families in each state were either depleted or showed low-level enrichment in the other states. Specifically, the DNA recognition motifs for the interferon-regulatory factor (IRF) and Cys2-His2 zinc finger (C2H2 ZF) transcription factor families were enriched in the Reactive state (Figure 4A, top panel). Regulatory factor X (RFX) and basic helix-loop-helix (bHLH) DNA recognition motifs were enriched in the Constructive state (Figure 4A, middle panel), while the Forkhead motif family was enriched in the Invasive state (Figure 4A, bottom panel). Genome-wide CRISPR/Cas9 essentiality screens (Figure 4—figure supplement 1B) in three Reactive, two Constructive and one Invasive GSC (MacLeod et al., 2019) revealed the preferential requirement for expressed transcription factors (Figure 4—figure supplement 1C–E) recognizing the enriched DNA recognition motif in a state-specific manner (Figure 4B). Specifically, the SP1 regulatory network is preferentially essential in the Reactive state GSCs (Figure 4B, top panel), ASCL1, OLIG2, AHR, and NPAS3 are uniquely essential in the Constructive state GSCs (Figure 4B, middle panel) and FOXD1 is essential only in the Invasive state GSC (Figure 4B, bottom panel). Notably, SP1 itself is exclusively essential in only one Reactive GSC (G564). However, of the 36 transcription factors from the Reactive-enriched families (IRF and C2H2 ZF) that were essential in at least one Reactive GSC and not in any of the Constructive or Invasive GSCs, 13 are directly regulated by SP1 (Figure 4—figure supplement 1C), thus suggesting that the SP1 regulatory network as a whole, rather than SP1 on its own, is key in the Reactive GSC state. Notably, all six transcription factors display significantly higher expression in GBM compared to normal brain (Tang et al., 2017; Figure 4—figure supplement 1F), further supporting their function as key regulators of tumor initiation and development. An additional gene set enrichment analysis combining genes exclusively essential to each state with the putative targets of the key transcription factors outlined above identifies additional enriched terms supporting the identities of the three GSC states as Reactive, Constructive, and Invasive (Figure 4—figure supplement 1G).

Figure 4 with 1 supplement see all
Functional diversity between glioblastoma stem cell (GSC) states and intra-tumor heterogeneity.

(A) Motif family enrichment in each cluster; log2(fold enrichment) > 0.5 threshold selected based on the distribution of values in each cluster (Figure 2—figure supplement 1). (B) Z-score distribution of key essential genes in each cluster. Red line corresponds to the empirically determined threshold for essentiality in each tested line, scaled, and adjusted to zero. Boxplot whiskers in this case extend to data extremes. Side barplots show the total count of the key cluster-specific regulators found essential in each subtype. (C) UMAPs with GSCs from each state highlighted based on the enrichment of the top differentially accessible regions in each GSC state.

Previous work suggests that GBM tumors harbor a heterogeneous population of GSCs (Lan et al., 2017; Meyer et al., 2015; Patel et al., 2014). We therefore quantified the presence of Reactive, Constructive, and Invasive cancer stem cells in our four primary GBM based on their scATAC-seq profiles. The Constructive state was dominant in every primary tumor, ranging from 9 to 21% of all cells captured by scATAC-seq (Figure 4C). The Reactive and Invasive states accounted for only 0–9% of all cells, in varying proportions from one tumor to another (Figure 4C). Collectively, our results further support the heterogeneous nature of cancer stem cells that populate primary tumors.

While various classifications of GBMs and/or their constitutive bulk and stem tumor cells have been reported, with some associating with patient survival (Neftel et al., 2019; Patel et al., 2014; Cancer Genome Atlas Research Network et al., 2010; Wang et al., 2017; Yin et al., 2019; Zuo et al., 2019), molecular signatures in adult GBM stratifying the poorer prognosis IDH wild-type patients by survival are lacking. We performed intracranial xenografts of 37 IDH wild-type GSC populations and classified the transplanted cells by their GSC state to perform a differential survival analysis (Figure 5A). The overall survival times of the transplanted mice grouped by GSC state were significantly different (LogRank test p=0.041), with the Invasive state GSCs leading to the worst prognosis. Relative to the mice injected with Reactive state GSCs, mice with Constructive state GSCs and Invasive state GSCs had hazard ratios of 1.3 (95% confidence interval [CI]: 0.57–2.97) and 3.5 (95% CI: 1.2–10.49), respectively.

Figure 5 with 2 supplements see all
Glioblastoma stem cell (GSC) invasiveness is associated with survival in glioblastoma (GBM).

(A) Kaplan–Meier plot for orthotopic xenografts grouped by GSC state. For each of the 37 GSCs used in the xenograft survival analysis, the median survival value was used from multiple mice injected with cells from each GSC (average number of mice injected/GSC = 5). The dotted lines indicate median survival. The pairwise p-values are also significant for Invasive vs Reactive (p=0.02) and Invasive vs Constructive (p=0.045) but not for Reactive vs Constructive (p=0.45). (B) Invaded area over time normalized to t0 by three to four representative GSCs of each GSC state and human fetal neural stem cells (hfNSCs) used as controls. See Figure 5—figure supplement 1A for individual GSC invasion. (C) Invasion images of representative hfNSCs and GSCs at t0, 6 hr, and 12 hr. Scale bar is 150 µm. (D) TCGA samples ordered by increasing concordance with Invasive GSCs and grouped into three subgroups: <1, 1–1.65, >1.65. (E) Kaplan–Meier plot for TCGA samples grouped by concordance with Invasive GSCs. The dotted line indicates median survival. When considering pairwise comparisons, only the Invasive-high and Invasive-low subgroups were significantly different (p=0.0043). Further subgrouping of the TCGA samples into smaller intervals of concordance z-score yielded no benefit, preserving the Invasive-high subgroup as the only one with significantly poorer prognosis (Figure 5—figure supplement 2B,C).

In light of this finding, we sought to evaluate the invasive properties of the Invasive GSC state population. To that end, we conducted an invasion assay (Restall et al., 2018) wherein neurospheres from representative GSCs of each state (four Reactive, three Constructive, and three Invasive; three independent biological replicates for each GSC) are embedded in a collagen matrix and invasion is monitored in real time (Figure 5B,CFigure 5—figure supplement 1A). GSCs of the Invasive state invaded strikingly faster and further into the collagen matrix and overall displayed a three times greater invasive ability than that of the Reactive and Constructive states. This invasiveness was further confirmed in vivo through human-specific staining of mouse brains injected with Invasive-state GSCs (Figure 5—figure supplement 1B–C).

Next, we investigated the prognostic value of the GSC states using the TCGA GBM cohort (IDH1 and IDH2 wild type, n = 144). When classified by dominant GSC state, TCGA tumors display the same trend as the xenografts with Invasive state-dominated tumors showing the lowest survival (Figure 5—figure supplement 2A). However, with only two tumors classified as Invasive-dominant, the difference in survival between the three patient groups was not statistically significant (p=0.3) (Figure 5—figure supplement 2A). We proceeded to rank the TCGA tumors solely by their concordance to Invasive GSCs and classified the patient tumors into Invasive-low (z-score < 1), Invasive-mid (z-score = 1–1.65), and Invasive-high (z-score ≥ 1.65) groups (Figure 5D). With this stratification method, median patient survival per group not only decreased with increasing Invasive GSC score, but we also identified an Invasive-high subset of tumors with significantly lower survival (p=0.019, HR (Hazard Ratio) = 2.8, 95% CI: [1.3–5.81]) (Figure 5E). This was further validated using an additional cohort of TCGA samples with microarray gene expression (Figure 5—figure supplement 2D–E). These results show that cancer stem cell states defined based on the chromatin accessibility in GSCs can identify transcriptional programs associated with poor prognosis and can serve as a signature to identify high-risk patients in IDH wild-type GBM.

Discussion

Defining the nature of self-renewing tumor-initiating cells in primary GBM is required to identify vulnerabilities for therapeutic intervention. Quantifying their heterogeneity within tumors can guide treatment strategies and assist in predicting the course of disease progression. Here we show that chromatin accessibility assays capture a heterogeneity across self-renewing tumor-initiating cells in primary GBM that extends beyond their genetic diversity and underlies the heterogeneity in bulk progeny (Meyer et al., 2015). This heterogeneity aligns with diversity in the three-dimensional genome organization of GSCs (Johnston et al., 2019) and agrees with how the three-dimensional genome organization instructs cis-regulatory plexuses underlying gene regulation (Bailey et al., 2016; Kim et al., 2016; Sallari et al., 2017; Schmitt et al., 2016; Zheng and Xie, 2019). While the chromatin accessibility signature derived from GSCs robustly identified cancer stem cells across the four primary GBM tumors used in this study, a larger cohort will be required to establish whether this signature is always sufficient to capture all cancer stem cells in primary GBMs.

We further reveal a specific cancer stem state that is significantly predictive of patient survival and can be used as a signature to identify high-risk patients. Specifically, the Invasive GSC signature draws from a population with enhanced ability to invade and spread compared to other GSC states. Collectively, our results argue for distinct GSC populations whose composition in tumors impacts survival.

Our results also highlight dependencies unique to each cancer stem state. Specifically, the Reactive GSC state-associated transcription factor SP1 and its regulatory partners are involved in cellular differentiation and growth, apoptosis, response to DNA damage, chromatin remodeling (O'Connor et al., 2016), stimulation of TERT expression in cancer stem cells (Liu et al., 2016), and increased stemness and invasion in GBM (Lee et al., 2014). In contrast, the Constructive GSC state relies on transcription factors including OLIG2, a known GSC marker (Trépant et al., 2015); AHR involved in tumor microenvironment responses and metabolic adaptation (Gabriely et al., 2017); NPAS3, a regulator of Notch signaling and neurogenesis (Michaelson et al., 2017); and ASCL1, a critical regulator of GSC differentiation and marker of sensitivity to Notch inhibition in GSCs (Park et al., 2017; Rajakulendran et al., 2019). Finally, the Invasive GSC state relies on FOXD1, a pluripotency regulator and determinant of tumorigenicity in GSCs regulating expression of the aldehyde dehydrogenase ALDH1A3, a functional marker for invasive GSCs (Cheng et al., 2016; Koga et al., 2014). These results support developing combination therapy using targeting agents against each GSC state, such as Notch inhibitors (Park et al., 2017) and small molecule inhibitors of ALDH (Cheng et al., 2016), to eradicate self-renewing tumor-initiating cells with the hope to cure GBM patients.

Materials and methods

Patient samples and cell culture

Request a detailed protocol

All tissue samples were obtained following informed consent from patients, and all experimental procedures were performed in accordance with the Research Ethics Board at The Hospital for Sick Children (Toronto, Canada), the University of Calgary Ethics Review Board, and the Health Research Ethics Board of Alberta – Cancer Committee (HREBA). Approval to pathological data was obtained from the respective institutional review boards. Patient tumor tissue samples were dissociated in artificial cerebrospinal fluid followed by treatment with enzyme cocktail at 37°C. Patient tumor-derived GSCs were grown as adherent monolayer cultures in serum-free medium (SFM) as previously described (Pollard et al., 2009). Briefly, cells were grown adherently on culture plates coated with poly-l-ornithine and laminin. Serum-free NS cell self-renewal media (NS media) consisted of Neurocult NS-A Basal media, supplemented with 2 mmol/L l-glutamine, N2 and B27 supplements, 75 μg/mL bovine serum albumin, 10 ng/mL recombinant human EGF (rhEGF), 10 ng/mL basic fibroblast growth factor (bFGF), and 2 μg/mL heparin. A subset (22/37) of the GSCs used for orthotopic xenografts were grown as non-adherent spheres prior to single-cell dissociation and injection into the mice. Briefly, SFM was used to initiate GSC cultures. Non-adherent spheres formed after 7–21 days in culture and were expanded, then cryopreserved in 10% dimethyl sulfoxide (DMSO; Sigma-Aldrich) in SFM until used in experiments.

ATAC-seq

Request a detailed protocol

ATAC-seq was used to profile the accessible chromatin landscape of 27 patient tumor-derived GSCs. Fifty thousand cells were processed from each sample as previously described (Buenrostro et al., 2013; Corces et al., 2017). The resulting libraries were sequenced with 50 bp single-end reads, which were mapped to hg19. Reads were filtered to remove duplicates, unmapped or poor-quality (Q < 30) reads, mitochondrial reads, chrY reads, and those overlapping the ENCODE blacklist. Following alignment, accessible chromatin regions/peaks were called using MACS2. Default parameters were used except for the following: --keep-dup all -B --nomodel --SPMR -q 0.05 --slocal 6250 --llocal 6250. The signal intensity was calculated as the fold enrichment of the signal per million reads in a sample over a modeled local background using the bdgcmp function in MACS2. Spectral clustering implemented in the SNFtool package (Wang et al., 2014) was run on the SNF-fused similarity matrix to obtain the groups corresponding to k = 2–12. Enrichment for genomic features was calculated using CEAS (Shin et al., 2009). The GSC state labels for each sample can be found in Supplementary file 1, and the coordinates of discriminating ATAC regions for each GSC state are in Supplementary file 2.

A hypergeometric test was used to determine whether there was any sex bias within the three GSC states. All tests resulted in p-value>0.05 except in the Invasive cluster where three out of five patients are female, giving a p-value=0.047. Given the small size of the Invasive group and the impossibility of obtaining an even number of males and females in a group of 5, this result was not considered to indicate sex bias within the cohort.

A given chromatin region was considered exclusive to one of the clusters if it was called as a peak in any of the cluster’s samples using a q-value filter of 0.05 and was not called as a peak in any of the other samples using a q-value filter of 0.2, in order to ensure stringency of exclusivity.

The ATAC-seq saturation analysis was performed by randomizing the order of samples and successively calculating the number of additional peaks discovered with the addition of each new sample. This process was repeated 10,000 times and averaged. A self-starting non-linear regression model was then fitted to the data to estimate the level of saturation reached.

For the xenograft survival analysis, 11/37 GSCs used overlap with the cohort of 27 described above. The other 26/37 GSCs were profiled by ATAC-seq independently following the same protocol described above and assigned to a GSC state as described below. Similarly, samples G432, G440, and G472 used in the essentiality gene analysis were profiled by ATAC-seq independently following the same protocol described above and assigned to a GSC state as follows: the signal obtained from MACS2 for each sample was mapped to the peak catalog of the original cohort of 27 GSCs. Each sample was then allocated to a GSC state through unsupervised hierarchical clustering with the original cohort of 27 GSCs.

Single-cell ATAC-seq

Request a detailed protocol

The four tumors used were G4218 (primary GBM, IDH wt, male, 64 years), G4250 (primary GBM, IDH wt, male, 73 years), G4275 (primary GBM, IDH wt, female, 52 years), and G4349 (primary GBM, IDH wt, male, 62 years). Fragments of tumor were received fresh from the operating room, and blunt dissected into individual fragments of approximately 0.3–0.7 cm3. Each fragment was placed in 1 mL of freezing media (400 μL of NeuroCult NS-A Basal medium with proliferation supplement (StemCell Technologies; #05751) containing 20 μg/mL rhEGF (Peprotech, AF-100–15), 10 μg/mL bFGF (StemCell Technologies, #78003), and 2 μg/mL heparin (StemCell Technologies, #07980); 500 μL of 25% bovine serum albumin (BSA) (Millipore-Sigma; A9647) in Dulbecco's modified Eagle's medium, and 100 μL DMSO (Millipore-Sigma; D2650) in a 2 mL cryotube, and placed at −80 C in a CoolCell for at least 24 hr. Samples were then stored at −80C until use. Cryopreserved primary GBM samples were washed at 1000 RPM for 5 min in phosphate-buffered saline (PBS) to remove DMSO, and then transferred to 1.5 mL tubes. Samples were resuspended in cold ATAC resuspension buffer (10 mM Tris–HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% NP-40, 0.1% Tween-20, 0.01% Digitonin, 1% BSA in PBS) on ice and dissociated using a wide-bore P1000 pipette tip and vortexing, followed by 10 min of incubation on ice. Cells were spun down at 500 x g for 5 min at 4°C, washed in the ATAC resuspension buffer, spun down again, and resuspended in ATAC-Tween wash buffer (10 mM Tris–HCl pH 7.4, 10 mM NaCl, 3 mM MgCl2, 0.1% Tween-20, 1% BSA in PBS), then passed through a cell strainer top FACS tube (Falcon; #38030) to remove debris. Nuclei quality and quantity was evaluated using trypan blue on an Invitrogen Countess II device in duplicate, and a subset of nuclei was spun down in a fresh tube and resuspended in 10× sample dilution buffer. Nuclei were then used for single-cell ATAC-seq library construction using the Chromium Single Cell ATAC Solution v1.0 kit (10× Genomics) on a Chromium controller. Completed libraries were further quality checked for fragment size and distribution using an Agilent TapeStation prior to sequencing. Single-cell ATAC-seq samples were sequenced on a NextSeq 500 (Illumina) instrument with 50 bp paired-end reads at the Centre for Health Genomics and Informatics (CHGI) at the University of Calgary.

The raw sequencing data was demultiplexed using cellranger-atac mkfastq (Cell Ranger ATAC, version 1.0.0, 10× Genomics). Single-cell ATAC-seq reads were aligned to the hg19 reference genome (hg19, version 1.1.0, 10× Genomics) and quantified using cellranger-atac count function with default parameters (Cell Ranger ATAC, version 1.1.0, 10× Genomics). The resulting data were analyzed using the chromVAR (Schep et al., 2017) and Signac (Stuart et al., 2019) R packages (v1.4.1). The number of accessibility modules in each sample was determined using the ElbowPlot method implemented in Signac. Similarity between individual cells and GSC states was assessed using the deviation scores calculated by chromVAR within the single-cell data for significantly differentially accessible sets of peaks (fold change signal difference > 2 and Wilcoxon test q-value ≤ 0.05) between the states as determined by bulk ATAC-seq. Similarity between individual cells and the expression-derived cellular states was assessed using the deviation scores calculated by chromVAR within the single-cell data for promoter regions of the signature genes of each of the cellular states (Neftel et al., 2019). A twofold cut-off was used to determine the dominance of a UMAP module by an individual or group of cellular states. Similarity between individual cells and the GBM cancer stem cell signatures was assessed using the deviation scores calculated by chromVAR within the single-cell data for either promoter regions of the 19 transcription factors identified as markers of cancer stem cells in GBM (Suvà et al., 2014) or accessible chromatin regions shared by a majority of GSCs (>14/27). An average of 11.3% of cells in each tumor was identified as a GSC: G4218: 11.3%, G4250: 8.6%, G4275: 8.1%, G4349: 17.4%.

Copy number variants in single cells were determined using CONICSmat (Müller et al., 2018) with default parameters using the gene activity matrix generated by Signac as input. We focused on chr7 gains and chr10 losses as they are hallmark chromosomal changes in GBM and found the following fractions of cells carrying these CNVs, on average across the four tumors: 76% of all cells, 88% of cells allocated to scRNA-seq cellular states (Neftel et al., 2019), 95% of cancer stem cells based on the 19 gene signature, 91% of GSCs based on shared accessible regions between 14/27 GSC populations, and 94% of GSCs identified based on the state-specific signatures.

DNA methylation arrays

Request a detailed protocol

Bisulfite conversion of DNA for methylation profiling was performed using the EZ DNA Methylation kit (Zymo Research) on 500 ng genomic DNA from all 27 samples. Conversion efficiency was quantitatively assessed by quantitative PCR (qPCR). The Illumina Infinium MethylationEPIC BeadChips were processed as per manufacturer’s recommendations. The R package ChAMP v2.6.4 (Morris et al., 2014) was used to process and analyze the data. For the copy number analysis, the raw IDAT files were uploaded to the MNP tool (Capper et al., 2018), which directly compares the copy number profile estimated from the probe intensities on the methylation array to the distribution observed across thousands of brain tumors in its database.

RNA-seq

Request a detailed protocol

RNA was extracted from GSCs using the Qiagen RNeasy Plus kit. RNA sample quality was measured by Qubit (Life Technologies) for concentration and by Agilent Bioanalyzer for RNA integrity. All samples had RIN above 9. Libraries were prepared using the TruSeq Stranded mRNA kit (Illumina). Two hundred nanograms from each sample were purified for polyA tail containing mRNA molecules using poly-T oligo attached magnetic beads, then fragmented post-purification. The cleaved RNA fragments were copied into first strand cDNA using reverse transcriptase and random primers. This is followed by second strand cDNA synthesis using RNase H and DNA Polymerase I. A single ‘A’ base was added and adapter ligated followed by purification and enrichment with PCR to create cDNA libraries. Final cDNA libraries were verified by the Agilent Bioanalyzer for size and concentration quantified by qPCR. All libraries were pooled to a final concentration of 1.8 nM, clustered, and sequenced on the Illumina NextSeq500 as a pair-end 75 cycle sequencing run using v2 reagents to achieve a minimum of ~40 million reads per sample. Reads were aligned to hg19 using the STAR aligner v2.4.2a (Dobin et al., 2013), and transcripts were quantified using RSEM v1.2.21 (Li and Dewey, 2011) or vst transformed using DESeq2 (Anders and Huber, 2010).

Motif enrichment

Request a detailed protocol

Regions exclusively accessible in one of the GSC states and not the others were used as input sequences for the motif enrichment, while the full ATAC-seq catalog served as the background set when running HOMER v4.7 to detect enrichments of transcription factor binding motifs. Enriched motifs were then grouped into families based on similarities in DNA-binding domains using the CIS-BP database (Weirauch et al., 2014). Each family was assigned the fold-enrichment value of the most enriched motif within the family.

The transcription factors whose motifs were found enriched in Reactive-exclusive accessible regions were run together through GSEA (Subramanian et al., 2005), and the gene set corresponding to genes potentially regulated by SP1 was identified as significantly enriched (GSEA gene set GGGCGGR_SP1_Q6). The expression levels of key transcription factors in tumor and normal samples were analyzed and displayed using GEPIA (Tang et al., 2017).

Gene essentiality screen

Request a detailed protocol

Illumina sequencing reads from genome-wide TKOv1 CRISPR screens in patient-derived GSCs (MacLeod et al., 2019) were mapped using MAGECK (Li et al., 2014) and analyzed using the BAGEL algorithm with version two reference core essential genes/non-essential genes (Hart et al., 2017; Hart et al., 2015). Resultant raw Bayes factor (BF) statistics were used to determine essentiality of transcription factor genes using a minimum BF of 3 and a 5% false discovery rate cut-off. For visualization purposes only, the essentiality scores were scaled and the individual GSC essentiality thresholds subtracted from each score to obtain a common threshold at 0 across GSCs. In the GSEA analysis of essentiality genes, only those genes found essential only in a given GSC state were used, giving sets of 12, 29, and 235 genes for the Reactive, Constructive, and Invasive GSC states, respectively.

Orthotopic xenografts

Request a detailed protocol

All animal procedures were performed according to and approved by the Animal Care Committee of the Hospital for Sick Children or the University of Calgary. All attempts are made to minimize the handling time during surgery and treatment so as not to unduly stress the animals. Animals are observed daily after surgery to ensure there are no unexpected complications. For intracranial xenografts, 100,000 GSC cells were stereotactically injected into the frontal cortex of 6–8 weeks old female NOD/SCID or C17/SCID mice. Mice were monitored and euthanized once neurological symptoms were observed or at the experimental end point of 12 months.

Invasion assay

Request a detailed protocol

GSCs and human fetal neural stem cells (hfNSCs; used as controls) were seeded in flasks maintained vertically to limit adherence and incubated at 37°C, 5% CO2 until the average neurosphere size reached approximately 150 μm. Neurospheres were then collected and allowed to settle by gravity to the bottom of a prechilled 1.5 mL conical tubes for 5 min on ice. Following aspiration of the supernatant, spheres were re-suspended in 0.4 mg/mL type I collagen (Cultrex Rat Collagen I, Trevigen, 3440–005) on ice. The suspended spheres were dispensed in prechilled 96-well plates (100 μL/well). The plates were maintained on ice for 5 min to allow spheres to settle at the same level at the bottom of the well and then transferred to the incubator at 37°C for 15 min to allow polymerization of the collagen. Plates were then placed in an IncuCyte (Essen Bioscience, Ann Arbor, MI), and invasion was monitored every hour for 12 hr. To quantify invasion of the cells from the embedded spheres into the collagen matrix, the area of the spheres at each time point was normalized to the area of the spheres at T0. Invasion experiments were performed at least in triplicates for each GSC and hfNSC lines. For each replicate, invasion was measured based on a minimum of three spheres.

Immunohistochemistry

Request a detailed protocol

Tissue samples were formalin fixed and paraffin embedded. Serial sections deparaffinized and rehydrated through an alcohol gradient to water, and antigen retrieval in citrate buffer pH 6.0 was used for the human nucleolin antibody at 5.0 g/mL (ab13541) (Abcam, Cambridge, MA). Endogenous peroxide activity and nonspecific binding were blocked with 3%(vol/vol) peroxide and 2% (vol/vol) normal horse serum. Primary antibody and anti-mouse ImmPRESS-HRP secondary antibody were incubated for 1 hr and visualized using 3,3′-diaminobenzidine (Vectorlabs, Burlingame, CA). Normal horse serum or monoclonal IgM was used in control sections.

Survival analysis

Request a detailed protocol

Survival analysis on xenografts and TCGA data was performed using R packages survival (Therneau, 2015) and survminer (‘Drawing Survival Curves using ‘ggplot2’ [R package survminer, 2020]’, n.d.). The LogRank test was used in every analysis. See ATAC-seq section for details on how each GSC used in the orthotopic xenografts was assigned to a GSC state. For each of the 37 GSCs used in the xenograft survival analysis, the median survival value was used from multiple mice injected with each GSC. Each GSC was injected into one to six mice (mean number of mice used = 5; only one GSC has a single replicate: G489, a Constructive state GSC).

TCGA samples for this analysis were selected as follows: 144 IDH-wt adult GBM samples for which RNA-seq, IDH mutation status, and survival information were available. TCGA samples were assigned to individual GSC states in the following ways. (1) Using the unsupervised clustering of RNA-seq data presented in Figure 3B, the 23/27 GSCs that displayed matched GSC state assignments by RNA-seq and ATAC-seq were used in this analysis. (2) Genes preferentially enriched in each GSC state were determined using DEseq2 (Anders and Huber, 2010) (q ≤ 0.05 and fold change ≥ 2). (3) The mean log2(FPKM+1) value for each of these genes over all GSCs in each state was calculated to obtain a single representative value for each gene in each of the three GSC states. (4) The concordance index was then calculated between each TCGA sample and each GSC state, and individual TCGA samples were assigned to the GSC state with the highest score. Similarly, to assign TCGA samples to the three Invasive groups (Invasive-low, -mid, and -high), the concordance to Invasive GSCs as calculated above was used. The z-score for each sample was then used to classify each TCGA sample into the three subgroups of Invasive-low (Invasive z-score < 1), Invasive-mid (Invasive z-score = 1–1.65), and Invasive-high (Invasive z-score ≥ 1.65). When changing the Invasive z-score thresholds for grouping the TCGA samples, the most Invasive-high subgroup remains associated with the lowest survival (Figure 5—figure supplement 2B,C).

A further 158 samples that did not overlap with the RNA-seq set for which microarray expression data, survival information, and IDH status were available were used in a validation set. Individual samples were assigned to GSC clusters or C3 score classes as described above. The scores were then combined with those of the RNA-seq set for the survival analysis shown in Figure 5—figure supplement 2D–E.

Data availability

The GSCs are available upon reasonable request from PBD and SW. The GSC ATAC-seq and DNA methylation data have been deposited at GEO (GSE109399). The scATAC-seq data has been deposited at GEO (GSE139136). RNA-seq data are available at EGA (EGAS00001003070).

The following data sets were generated
    1. Guilhamon P
    2. Kushida MM
    3. Dirks PB
    4. Lupien M
    (2019) NCBI Gene Expression Omnibus
    ID GSE109399. Epigenetic characterization of glioblastoma stem cells.

References

    1. Capper D
    2. Jones DTW
    3. Sill M
    4. Hovestadt V
    5. Schrimpf D
    6. Sturm D
    7. Koelsche C
    8. Sahm F
    9. Chavez L
    10. Reuss DE
    11. Kratz A
    12. Wefers AK
    13. Huang K
    14. Pajtler KW
    15. Schweizer L
    16. Stichel D
    17. Olar A
    18. Engel NW
    19. Lindenberg K
    20. Harter PN
    21. Braczynski AK
    22. Plate KH
    23. Dohmen H
    24. Garvalov BK
    25. Coras R
    26. Hölsken A
    27. Hewer E
    28. Bewerunge-Hudler M
    29. Schick M
    30. Fischer R
    31. Beschorner R
    32. Schittenhelm J
    33. Staszewski O
    34. Wani K
    35. Varlet P
    36. Pages M
    37. Temming P
    38. Lohmann D
    39. Selt F
    40. Witt H
    41. Milde T
    42. Witt O
    43. Aronica E
    44. Giangaspero F
    45. Rushing E
    46. Scheurlen W
    47. Geisenberger C
    48. Rodriguez FJ
    49. Becker A
    50. Preusser M
    51. Haberler C
    52. Bjerkvig R
    53. Cryan J
    54. Farrell M
    55. Deckert M
    56. Hench J
    57. Frank S
    58. Serrano J
    59. Kannan K
    60. Tsirigos A
    61. Brück W
    62. Hofer S
    63. Brehmer S
    64. Seiz-Rosenhagen M
    65. Hänggi D
    66. Hans V
    67. Rozsnoki S
    68. Hansford JR
    69. Kohlhof P
    70. Kristensen BW
    71. Lechner M
    72. Lopes B
    73. Mawrin C
    74. Ketter R
    75. Kulozik A
    76. Khatib Z
    77. Heppner F
    78. Koch A
    79. Jouvet A
    80. Keohane C
    81. Mühleisen H
    82. Mueller W
    83. Pohl U
    84. Prinz M
    85. Benner A
    86. Zapatka M
    87. Gottardo NG
    88. Driever PH
    89. Kramm CM
    90. Müller HL
    91. Rutkowski S
    92. von Hoff K
    93. Frühwald MC
    94. Gnekow A
    95. Fleischhack G
    96. Tippelt S
    97. Calaminus G
    98. Monoranu CM
    99. Perry A
    100. Jones C
    101. Jacques TS
    102. Radlwimmer B
    103. Gessi M
    104. Pietsch T
    105. Schramm J
    106. Schackert G
    107. Westphal M
    108. Reifenberger G
    109. Wesseling P
    110. Weller M
    111. Collins VP
    112. Blümcke I
    113. Bendszus M
    114. Debus J
    115. Huang A
    116. Jabado N
    117. Northcott PA
    118. Paulus W
    119. Gajjar A
    120. Robinson GW
    121. Taylor MD
    122. Jaunmuktane Z
    123. Ryzhova M
    124. Platten M
    125. Unterberg A
    126. Wick W
    127. Karajannis MA
    128. Mittelbronn M
    129. Acker T
    130. Hartmann C
    131. Aldape K
    132. Schüller U
    133. Buslei R
    134. Lichter P
    135. Kool M
    136. Herold-Mende C
    137. Ellison DW
    138. Hasselblatt M
    139. Snuderl M
    140. Brandner S
    141. Korshunov A
    142. von Deimling A
    143. Pfister SM
    (2018) DNA methylation-based classification of central nervous system tumours
    Nature 555:469–474.
    https://doi.org/10.1038/nature26000
    1. O'Connor L
    2. Gilmour J
    3. Bonifer C
    (2016)
    The role of the ubiquitously expressed transcription factor Sp1 in Tissue-specific transcriptional regulation and in disease
    The Yale Journal of Biology and Medicine 89:513–525.
  1. Software
    1. Therneau T
    (2015)
    A Package for Survival Analysis in S, version 2.38
    A Package for Survival Analysis in S.

Decision letter

  1. Lynne-Marie Postovit
    Reviewing Editor; University of Alberta, Canada
  2. Kevin Struhl
    Senior Editor; Harvard Medical School, United States
  3. Roel Verhaak
    Reviewer; Jackson Laboratory, United States

In the interests of transparency, eLife publishes the most substantive revision requests and the accompanying author responses.

Acceptance summary:

The current manuscript by Guilhamon et al. uses single cell sequencing technologies to identify and then characterize heterogeneities among glioblastoma (GBM) stem cells (GSC), which extend beyond current disease classifications. This is the first study to apply single cell sequencing specifically to GSCs with the goal of identifying GSC subtypes using primarily chromatin accessibility signatures. Moreover, the use of chromatin accessibility signatures in combination with TF motif prediction to guide identification of transcriptional dependencies is novel in the context of GSC studies.

Decision letter after peer review:

[Editors’ note: the authors submitted for reconsideration following the decision after peer review. What follows is the decision letter after the first round of review.]

Thank you for submitting your work entitled "Single-cell chromatin accessibility in glioblastoma delineates cancer stem cell heterogeneity predictive of survival" for consideration by eLife. Your article has been reviewed by three peer reviewers, one of whom is a member of our Board of Reviewing Editors, and the evaluation has been overseen by a Senior Editor. The following individual involved in review of your submission has agreed to reveal their identity: Roel Verhaak (Reviewer #3).

Our decision has been reached after consultation between the reviewers. Based on these discussions and the individual reviews below, we regret to inform you that your work will not be considered further for publication in eLife, due to the need for revisions that would likely take longer than 2-3 months. We would however encourage resubmission once the revisions are completed.

Summary

The current manuscript by Guilhamon et al. uses single cell sequencing technologies to identify and then characterize heterogeneities among glioblastoma (GBM) stem cells (GSC), which extend beyond current disease classifications. The paper was reviewed by three experts, who all agreed that the work will add meaningful knowledge to the field; but that more experimentation would be needed to better support the claims as made. The reviewers agreed that this is the first study to apply single cell sequencing specifically to GSCs with the goal of identifying GSC subtypes using primarily chromatin accessibility signatures. In particular, the use of chromatin accessibility signatures in combination with TF motif prediction to guide identification of transcriptional dependencies is novel in the context of GSC studies. However, it was felt that a number of enhancements would be needed, including greater validation of the cell samples, the inclusion of additional samples, and clarification of some of the methodologies employed. We encourage resubmission once these issues have been addressed.

Major concerns:

1) The tumor cell and GSC population enrichment should be validated using functional and genomic methods (including evidence for either Chr7 amplification or Chr10 deletion), and additional patient samples should be analyzed, preferably without extended culture times. Indeed, there appeared to be a low correlation between GBM stem TF enrichment and GSC signature enrichment and the degree of concordance between samples was not fully developed. It would be important to have more samples for many reasons, including the idea presented by the authors of differences between tumors in the dominant cell type. GBM has significant variation spatially, as well, so having only 4 tumors seems underpowered.

2) Details about the tumor collection should be included as should full information about tumor genetics, patient information, and treatment for each sample.

3) Figure 1 should be revised to clarify cells which fail to classify and to signify whether shared chromatin states exist across patients.

4) Figure 3 should be revised. Data pertaining to GSCs should be compared to the published GBM scRNAseq data to check the subtypes (Neftel et al., 2019, etc.). In addition, the top and bottom panels could be combined, and the sample assignment of TCGA subtypes, chromatin clusters and expression clusters should be better presented in the sankey diagram. The GSC samples and color bars showing chromatin clusters should be kept consistent between the top and the bottom panels. Figure 3A is missing a legend as there is no indication of what the intensity values represent. For Figure 3B, please clearly indicate an associated P-value testing for the enrichment of TCGA subtypes among the GSC ATAC states. Figure 3C should present cut-offs for what was considered statistically significant (i.e., a dotted line). Importantly, "Figure 3—figure supplement 1" presents the most differentially accessible regions between GSC states. Notably, ChrX is present in all. Chromatin accessibility read-outs might be significantly impacted by the presence of an inactive X chromosome. Hence an imbalance in the sex distribution in the GSC states might pick up sex-specific and not tumor-specific differences. Thus, the authors should show that these regions do not vary with patient sex. Finally, the evidence that "GSC states are not genetically defined (Figure 3D)" is derived from DNA methylation copy number, and a signature of selected genes. Bona fide GBM genes such as PDGFRA, NF1, AKT3, QKI should also be included in this analysis and the possibility that the GSC states cluster by frequent chromosome arm-level events (chr7, chr19, chr20 gains and chr10 loss) should be entertained. If they still do not cluster by GSC state, then it is perhaps more accurate to say that "GSC states are not defined by somatic copy number events" since there is no gene mutation data presented for these samples.

5) The validation of the states in the TCGA data are not strong. The cut-offs seem arbitrary and most of the samples seem to fall into a single group. Is this due to sampling bias? There is no connection to tumor genetics. Also, the bulk GBM ATACseq tumor profiles were generated in the TCGA datasets that presumably have accompanying RNA, DNA, and methylation data (Corces et al., Science 2018). While this represents a small dataset, it would be advantageous to incorporate analysis of these more complex tissue samples alongside the already presented data to provide confidence in generalizability. The concordance between RNA-seq and ATAC-seq data for the paired datasets should also be presented.

6) Invasion assays should include all subtypes in order to highlight properties of the invasive subtype.

7) In order to support the claim that the "Invasive-high" group confers a worse prognosis, a multivariate analysis should be employed if possible.

8) There are a number of clerical errors that should be addressed:

– Figure 4—figure supplement 3A is blank

– Please double check all sample names, in Figure 3B, G613 changes from invasive to reactive and there are several instances where the sample names do not match up.

– If other samples (outside of the 27 discussed in Figure 3) were used in Figure 4, it should be made clear. G472, G440, and G473 are not in Figure 3

– A table with all the sample names, GSC subtype identified through ATAC-seq, and validation of GSC should be included

– Figure 4—figure supplement 1 G is missing labels

– Please include the GSEA gene set size.

– Describe how many essentiality genes were included for GSEA genes

– Figure 4D should be revised to better represent the number of mice were used. A detailed experimental design and the survival curve analysis should also be provided.

– Figure 2—figure supplement 1 lacks sample annotation in both the image and the figure legend.

– Figure 4—figure supplement 3 was blacked out in the version provided for review.

– There appears to be a typographical error at the start of Results paragraph five, "FOXD1 is essential only in the Reactive state […]". This seems like it should be, "Invasive".

Reviewer #1:

The current manuscript by Guilhamon et al. uses single cell sequencing technologies to identify and then characterize heterogeneities among glioblastoma (GBM) stem cells (GSC), which extend beyond current disease classifications. Single cell technologies have already been applied to GBM in various publications. The identification of an invasive GSC subtype associated with poor survival is not novel, as it largely complements the previously identified mesenchymal GSC subtype. However, as far as I'm aware this is the first study to apply single cell sequencing specifically to glioblastoma stem cells (GSCs) with the goal of identifying GSC subtypes using primarily chromatin accessibility signatures. In particular, the use of chromatin accessibility signatures in combination with TF motif prediction to guide identification of transcriptional dependencies is novel in the context of GSC studies. Thus, this manuscript would constitute a meaningful contribution to the field of GBM and epigenetics.

– Validation of GSC population enrichment should be done and shown. Particularly because despite what is described in the text, Figure 2F show low correlation between GBM stem TF enrichment and GSC signature enrichment. 2 out of 4 primary GBM show very little correlation. Therefore, GSC enrichment should be validated with other methods, such as staining for known neural stem cell markers or clustering/enrichment/scoring based on validated GSC gene sets.

– Chromatin accessibility scores at promoter regions is used as a surrogate for gene expression for numerous analyses. What is the concordance between RNA-seq and ATAC-seq for the paired datasets? Particularly, what is the concordance for the subtype specific peaks? The concordance of gene expression and chromatin accessibility feels particularly important when exclusive peaks only account for ~5-15% of all peaks.

– Invasion assays should include all subtypes in order to highlight properties of the invasive subtype. Additionally, the author's conclusions would be strengthened by the combination of FOXD1 knockdown/inhibition with the invasion assays.

– Figure 4—figure supplement 3A is blank.

– Please double check all sample names, in Figure 3B, G613 changes from invasive to reactive. There are several instances where the sample names do not match up.

– If other samples (outside of the 27 discussed in Figure 3) were used in Figure 4, it should be made clear. G472, G440, and G473 are not in Figure 3.

– Please include table with all the sample names, GSC subtype identified through ATAC-seq, and validation of GSC.

– Figure 4—figure supplement 1 G is missing labels.

Reviewer #2:

In this manuscript, Guilhamon et al. describe a single cell ATACseq study in GBM patient tissues and bulk ATACseq in a panel of patient-derived GBM stem cell cultures. Using this dataset, the authors studied the three states in glioblastoma stem cell cultures, which presents some interesting findings to the biology of GBM. The dataset generated in this study is useful to the field. There are some concerns to be addressed:

The single cell data would seem to be potentially the most important, but the number of samples is small and the degree of concordance between samples was not fully developed. It would be important to have more samples for many reasons, including the idea presented by the authors of differences between tumors in the dominant cell type. GBM has significant variation spatially, as well, so having only 4 tumors seems underpowered.

There are missing details about the tumor collection. The three groups are designated to include reactive, which has immune cell signatures, and invasive. Often the tumor does not include areas of necrosis and invasion in most surgical specimens.

Why were the GSC cultured for 5-12 passages? This would seem to induce potential selection or change the cells. If the laminin conditions are optimal, why weren't the GSC studied immediately? It would be helpful to include support for the LDAs and xenograft studies for each of the lines.

Please include the full information about tumor genetics, patient information, and treatment for each sample.

There are many very strong statements made. It would seem premature to call single cells GSC only based on transcription. I am not aware that any profile is validated to claim absolutely to identify GSC. This is but one of the leaps that were made. I would ask the authors to be more careful and simply identify exactly what was measured.

There is no direct comparison of GBM tissues and GSC cultures. The conclusion that GSCs "reflect the chromatin identity of putative cancer stem cells found in primary brain tumors" should be supported by more evidence. For examples, direct comparison of some important drivers or markers at the chromatin level should be provided.

The analysis of GBM cancer stem cell signature and 4 cellular states is interesting. Can the authors explain why the putative cancer stem cells were under-represented in MES cell state? Did the authors consider or separate transcription factors or markers for cancer stem cells of different types in GBM? Is there any difference between stem and non-stem cells from the same cellular state?

Is there overlap between 4 GBM tissues with scATAC data and patients with GSC culture?

The description of the mismatch between chromatin and TCGA subtype is confusing (Figure 3B). According to the description, GSCs are believed to be a relatively pure population, can the authors compare GSCs to the published GBM scRNAseq data to check the subtypes (Neftel et al., 2019, etc.)? In addition, the top and bottom panels could be combined, and the sample assignment of TCGA subtypes, chromatin clusters and expression clusters should be better presented in the sankey plot. The GSC samples and color bars showing chromatin clusters are not consistent between the top and the bottom panels, making it really difficult to understand.

It's interesting that many GSCs express immune cell related genes, which should be further explained, as the enrichment significance (FDR) is not very strong.

The differentially expressed gene analysis between three GSC states should be performed to provide state markers to have a better understanding of cellular states.

Figure 4D is difficult to understand. What's the meaning of the n? How many mice were used in the experiment? A detailed experimental design and the survival curve analysis should be provided.

The CRISPR screen would be beneficial to validate, including additional direct studies of specific targets in a larger cohort of the GSC.

The validation of the states in the TCGA data are not strong. The cutoffs seem arbitrary and most of the samples seem to fall into a single group. Is this due to sampling bias? There is no connection to tumor genetics.

Reviewer #3:

Guilhamon et al. performed single-cell ATACseq in four primary IDH-wildtype glioblastomas to derive chromatin accessibility signatures of heterogeneous glioma stem cell populations. The scATACseq primary tumor analysis suggested the existence of three major modules: Reactive, Constructive, and Invasive that define glioma stem populations and are present to varying degrees within a single tumor. The module/cell state relevance is supported by leveraging new and existing functional assay data in sets of patient-derived GSC populations. The authors then demonstrate that transcription factors nominated by DNA recognition motif enrichment in each GSC state are preferentially essential through analysis of a recently published genome-wide CRISPR/Cas9 essentiality screen. Finally, evidence is presented to suggest that these chromatin signatures are associated with survival in a xenograft model and TCGA subjects.

Overall, this study addresses an important biological question of glioblastoma stem cell heterogeneity and is one of the earliest papers investigating scATAC glioblastoma profiles. However, the conclusions drawn are not fully supported by the results and in some cases the technical details of the experiment/analysis are lacking. The following suggestions are intended to help strengthen this work.

1) The clarity of the paper would be aided by additional technical descriptions throughout the manuscript. For example:

a) The four GBM samples profiled using scATACseq were classified using the Neftel signatures, on the basis of accessibility of signature gene promoters. This approach will bin individual cells whether the classification is accurate, or not. How can we be sure that this approach accurately provides Neftel classification?

b) The same classification method is suggested to result in greater stratification of the Neftel MES category. This result may be confounded by the presence of tumor-associated microglia that will be classified as mesenchymal. Were all classified cells tumor cells and how was this validated?

c) It was not fully explained why single cells without evidence for either Chr7 amplification or Chr10 deletion were included in further analyses regarding glioma stem cell populations. The cells that were not labelled by "GBM CNV+" (Figure 2—figure supplement 1) also seem to less frequently be classified as "STEM" (Figure 1D, especially G4250 and G4275). Are these cells of poorer quality or do they represent non-tumor cell types? It would also be helpful to know whether the authors enriched for tumor cells during the dissociation process (otherwise one would expect non-tumor cells too), the number of nuclei that were loaded for 10X, basic quality control metrics for each sample, and if cells identified as non-tumor were excluded from construction of the chromatin modules.

2) Figure 1 was challenging to follow with the multiple annotations of the patient UMAPs. For instance, in panels 1B and 1D the original numbered clusters remain for some not all cells. Are these cells that failed to confidently classify as cell states (i.e., AC, MES, NPC, OPC) or stem states? Perhaps consider "graying" out cells that fail to classify and explain why not all tumor cells might have a cell state classification would benefit the audience. I could not find mention of this in the Results or Materials and methods sections. Also, what happens when all cells from the four subjects are analyzed together in a single analysis? Do shared chromatin states that exist across patients emerge or is there highly specific patient clustering that reflects inter-patient heterogeneity? Panel C in this figure was also difficult to understand. None of the modules equal to 100%, is this because some cells failed to classify?

3) Bulk GBM ATACseq tumor profiles were generated in the TCGA dataset that presumably have accompanying RNA, DNA, and methylation data (Corces et al. Science 2018). While this represents a small dataset, it would be advantageous to incorporate analysis of these more complex tissue samples alongside the already presented data to provide confidence in generalizability.

4) Figure 3 appears to be missing some critical information. Figure 3A is missing a legend as there is no indication what the intensity values represent. For Figure 3B, is there an associated P-value testing for the enrichment of TCGA subtypes among the GSC ATAC states? Figure 3C could present cut-offs for what was considered statistically significant (i.e., a dotted line). Importantly, "Figure 3—figure supplement 1" presents the most differentially accessible regions between GSC states. While it is difficult to read the all the regions it's notable that ChrX is present in all. I imagine that chromatin accessibility read-outs might be significantly impacted by the presence of an inactive X chromosome. If there is an imbalance in the sex distribution in the GSC states, then this might skew to picking up sex-specific and not tumor-specific differences. Have the authors shown that these regions do not vary with patient sex.

5) The set of GSCs making up the Invasive subtype are all classified as Mesenchymal by gene expression (Figure 3B), consider calling it as such for consistency?

6) The evidence that "GSC states are not genetically defined (Figure 3D)" is derived from DNA methylation copy number, and a signature of selected genes. Is this sufficient? What about bona fide glioblastoma genes such as PDGFRA, NF1, AKT3, QKI? Did DNA copy number profiles align with transcriptional subtypes? Do the GSC states cluster by frequent chromosome arm-level events (chr7, chr19, chr20 gains and chr10 loss)? If they still do not cluster by GSC state, then it is perhaps more accurate to say that "GSC states are not defined by somatic copy number events" since there is no gene mutation data presented for these samples.

7) More than one publication has speculated to be able to stratify IDH wild type glioblastoma into prognostically relevant groups, i.e. PMID 30753603. For a tumor type with a single treatment that is not very effective, it is unclear that predicting prognosis has relevance.

8) In a multi-variate model that includes known prognostic factors (i.e., subject age and MGMT promoter methylation status) does the "Invasive-high" group still confer a worse prognosis? This would aid the author's claim that the signature is "predictive of low patient survival" independent on known prognostic factors.

9) Single-nucleus ATACseq of IDH mutant glioma has been reported (PMID 31806013) and there is some overlap in the gene sets that were identified to associate with cell states and survival, and it would be appropriate to cite this literature.

[Editors’ note: further revisions were suggested prior to acceptance, as described below.]

Thank you for resubmitting your work entitled "Single-cell chromatin accessibility of glioblastoma identifies invasive cancer stem cell linked to lower survival" for further consideration by eLife. Your revised article has been evaluated by Kevin Struhl (Senior Editor) and a Reviewing Editor.

The manuscript has been improved but there are some remaining issues that need to be addressed before acceptance, as outlined below:

The current manuscript by Guilhamon et al. uses single cell sequencing technologies to identify and then characterize heterogeneities among glioblastoma (GBM) stem cells (GSC), which extend beyond current disease classifications. The paper will add meaningful knowledge to the field and revisions have enhanced this manuscript. For example, validation of the cells was provided, methodologies were clarified, and impressive data related to the invasive subtype was provided. Additional samples would have enhanced the study, but were difficult to obtain given the COVID-19 pandemic. Overall, though, this would be the first study to apply single cell sequencing specifically to GSCs with the goal of identifying GSC subtypes using primarily chromatin accessibility signatures. Moreover, the use of chromatin accessibility signatures in combination with TF motif prediction to guide identification of transcriptional dependencies is novel in the context of GSC studies. These strengths overcome weaknesses associated with small sample numbers.

1) Please clarify what percentage of cells were estimated to be stem cells, among the scATACseq datasets.

2) Please modify the statement "A more likely possibility is that given the TCGA subtypes were determined from bulk GBM, they may not fully capture the nature of rarer populations found within a tumor, such as the cancer stem cell populations." Due to the amount of testing needed to test this, please modify this explanation to be broader and to also include the possibility that this is due at least in part to differences in microenvironments.

3) Please modify the statement "No molecular signature in GBM has so far been reported that can significantly stratify the poorer prognosis IDH wild-type patients by survival." This is incorrect. DNA methylation profiling classifies IDH wild type tumors into several categories including Mesenchymal-like (Ceccarelli et al., Cell, 2016), which shows significantly worse outcomes even among only MGMT-unmethylated tumors.

4) Regarding the statement "Using the TCGA IDH wild type cohort (n=144)", please clarify that: a) this cohort relates to the subset of GBM IDH wildtype cases with available RNAseq data (presumably); and b) that the GSC states were converted from a chromatin accessibility signature to a RNA expression signature.

5) There is no obvious reason why the TCGA IDH wildtype cohort could not also be classified on the basis of the available Affymetrix microarray data, which would enable analysis of a substantially larger cohort in pursuit of a survival difference (n=400). Please include this analysis.

6) Please include a small paragraph in the Discussion about the limitations of the study; particularly as they relate to a potentially small samples size.

Reviewer #3:

A few remaining comments with respect to the revision:

1) It would be helpful to clarify what percentage of cells were estimated to be stem cells, among the scATACseq datasets.

2) "A more likely possibility is that given the TCGA subtypes were determined from bulk GBM, they may not fully capture the nature of rarer populations found within a tumor, such as the cancer stem cell populations." While in this reviewer's opinion the most likely explanation is the presence/absence of a tumor microenvironment, we won’t know without a substantial amount of extra testing, which is beyond the scope. Please modify as to keep the explanations broad.

3) "No molecular signature in GBM has so far been reported that can significantly stratify the poorer prognosis IDH wild-type patients by survival." This is incorrect. DNA methylation profiling classifies IDH wild type tumors into several categories including Mesenchymal-like (Ceccarelli et al., Cell, 2016), which shows significantly worse outcomes even among only MGMT-unmethylated tumors.

4) "Using the TCGA IDH wild type cohort (n=144)" it would help to clarify that a. this cohort relates to the subset of GBM IDH wildtype cases with available RNAseq data (presumably); and b. that the GSC states were converted from a chromatin accessibility signature to a RNA expression signature.

5) There is no obvious reason why the TCGA IDH wildtype cohort could not also be classified on the basis of the available Affymetrix microarray data, which would enable analysis of a substantially larger cohort in pursuit of a survival difference (n=400).

https://doi.org/10.7554/eLife.64090.sa1

Author response

[Editors’ note: the authors resubmitted a revised version of the paper for consideration. What follows is the authors’ response to the first round of review.]

Major concerns:

1) The tumor cell and GSC population enrichment should be validated using functional and genomic methods (including evidence for either Chr7 amplification or Chr10 deletion), and additional patient samples should be analyzed, preferably without extended culture times. Indeed, there appeared to be a low correlation between GBM stem TF enrichment and GSC signature enrichment and the degree of concordance between samples was not fully developed. It would be important to have more samples for many reasons, including the idea presented by the authors of differences between tumors in the dominant cell type. GBM has significant variation spatially, as well, so having only 4 tumors seems underpowered.

Evidence of chr7 and chr10 alterations at the single cell level can be found in Figure 2—figure supplement 1A-D, with details presented in the Materials and methods section “Single cell ATAC-seq”, and in the main text. To further explore the concordance between samples in response to this and other comments below, we also annotated single cells that failed to classify into one of the Neftel et al. subtypes, and the number of tumours sharing UMAP modules of each state. These data can be found in Figure 1—figure supplement 1A-B, and in the main text.

We also conducted additional analyses into the concordance of stem cell calls by the two signatures (GSCs and the 19 stem TF signature by Suva et al., 2014). We found that the overlap in cells significantly enriched for either stem signature across the four tumours was significant (hypergeometric test p-value = 8.8 x 10−8), and this has been added to the main text.

Restrictions due to the covid-19 pandemic prevented us from obtaining and processing new samples for scATAC-seq. We however attempted to incorporate new samples into our single-cell analysis using publicly available data from the Wang et al. (Cancer Discovery, 2019) study. That study contained four IDH-wt GBM samples processed for scATAC-seq, of which three had raw data available (one sample’s data was corrupted on the EGA archive). Unfortunately, only one of the three samples had more than 350 cells available after quality control, and that sample had only 19% sequencing saturation (as compared to ~75% in our samples). Under these circumstances it was unfeasible to integrate these data into our study and perform on them the same analyses used on our samples.

Collectively, the data presented in this manuscript represent the most comprehensive study to date of single cell chromatin accessibility in glioblastoma and, as highlighted in your summary above, provide novel insights and a valuable resource to the community.

2) Details about the tumor collection should be included as should full information about tumor genetics, patient information, and treatment for each sample.

We have included details of tumor collection for those samples used in the scATAC-seq as well as known patient characteristics (IDH status, sex, and age) in the Materials and methods section “Single cell ATAC-seq”. Details on sample collection for the GSCs are included in the Materials and methods section “Patient samples and cell culture”. Details on the GSC state of each GSC are in Supplementary Table 1.

3) Figure 1 should be revised to clarify cells which fail to classify and to signify whether shared chromatin states exist across patients.

We have added that information in the form of stacked bar plots in Figure 1—figure supplement 1A.

4) Figure 3 should be revised. Data pertaining to GSCs should be compared to the published GBM scRNAseq data to check the subtypes (Neftel et al., 2019, etc.). In addition, the top and bottom panels could be combined, and the sample assignment of TCGA subtypes, chromatin clusters and expression clusters should be better presented in the sankey diagram. The GSC samples and color bars showing chromatin clusters should be kept consistent between the top and the bottom panels. Figure 3A is missing a legend as there is no indication of what the intensity values represent. For Figure 3B, please clearly indicate an associated P-value testing for the enrichment of TCGA subtypes among the GSC ATAC states. Figure 3C should present cut-offs for what was considered statistically significant (i.e., a dotted line). Importantly, "Figure 3—figure supplement 1" presents the most differentially accessible regions between GSC states. Notably, ChrX is present in all. Chromatin accessibility read-outs might be significantly impacted by the presence of an inactive X chromosome. Hence an imbalance in the sex distribution in the GSC states might pick up sex-specific and not tumor-specific differences. Thus, the authors should show that these regions do not vary with patient sex. Finally, the evidence that "GSC states are not genetically defined (Figure 3D)" is derived from DNA methylation copy number, and a signature of selected genes. Bona fide GBM genes such as PDGFRA, NF1, AKT3, QKI should also be included in this analysis and the possibility that the GSC states cluster by frequent chromosome arm-level events (chr7, chr19, chr20 gains and chr10 loss) should be entertained. If they still do not cluster by GSC state, then it is perhaps more accurate to say that "GSC states are not defined by somatic copy number events" since there is no gene mutation data presented for these samples.

Following the reviewer’s recommendations we have combined the top and bottom panels of Figure 3B, and corrected the sample labelling on the heatmap. We thank the reviewers for pointing out that error. We have additionally added a legend to both heatmaps on Figure 3A and 3B and added the GSEA Enrichment Score range to Figure 3B.

In Figure 3C only significant (q-value <=0.05 after multiple test correction) terms are displayed. We have included this in the revised figure legend.

The reviewers correctly point out that sex imbalance between groups could have been a concern; we had verified there was none early in the study but failed to discuss it in the original version of this manuscript. This is now included in the Materials and methods section “ATAC-seq”.

On the reviewers’ recommendation, we have extensively updated Figure 3D, including additional samples, as well as more GBM-related genes and the large genomic events most commonly found in GBM, including on chromosomes 7,10,19 and 20. The conclusions from this analysis are maintained and we have amended the text as suggested.

5) The validation of the states in the TCGA data are not strong. The cut-offs seem arbitrary and most of the samples seem to fall into a single group. Is this due to sampling bias? There is no connection to tumor genetics. Also, the bulk GBM ATACseq tumor profiles were generated in the TCGA datasets that presumably have accompanying RNA, DNA, and methylation data (Corces et al. Science 2018). While this represents a small dataset, it would be advantageous to incorporate analysis of these more complex tissue samples alongside the already presented data to provide confidence in generalizability. The concordance between RNA-seq and ATAC-seq data for the paired datasets should also be presented.

The cut-offs and methods used in the TCGA state labelling are detailed in the Materials and methods section “Survival Analysis”. They were set at z-score of 1.65 (which corresponds to a p-value of 0.05) for the Invasive-high to identify those patients with a significant enrichment of the Invasive GSC signature. For the Invasive-mid patients, we chose a z-score between 1 and 1.65 in order to identify those patients who displayed an important enrichment of the signature even though it did not reach significance. In addition, we show in Figure 5—figure supplement 2B-C, that changing these thresholds does not alter the conclusions of the analysis.

While we appreciate the suggestion to use the Corces et al. datasets to match expression and chromatin accessibility, the 9 GBM ATAC-seq samples used in that study do not have RNA-seq data available. This can be verified in the supplementary table of the study as well as directly on the GDC data portal.

6) Invasion assays should include all subtypes in order to highlight properties of the invasive subtype.

We have performed numerous additional invasion assays to confirm the invasive properties of the invasive GSC state. Using a minimum of three representative GSC lines from each of the three GSC states as well as two human fetal neural stem cells (hfNSCs) as controls, we performed collagen invasion assays. For each GSC, the assay was repeated in triplicate, and at least 3 spheres were measured for each replicate. The results confirm that Invasive state GSCs are over 3-fold more invasive than GSCs from the other states or the control hfNSCs. These data can be found in Figure 5 B-C, Figure 5—figure supplement 1, in the Materials and methods section “Invasion assays”, and in the main text.

7) In order to support the claim that the "Invasive-high" group confers a worse prognosis, a multivariate analysis should be employed if possible.

With this analysis, we were investigating whether a particular GSC subtype enriched in a given tumour might be associated with survival. To do this we calculated the hazard ratios of the different GSC states both in the xenografts and TCGA data as shown in the Results. As no other molecular signature in GBM has so far been reported that can significantly stratify the poorer prognosis IDH wild-type patients by survival, and as using profiles from GSCs specifically has not been attempted before, there are no known covariates we could integrate into this analysis.

8) There are a number of clerical errors that should be addressed:

– Figure 4—figure supplement 3A is blank

– Please double check all sample names, in Figure 3B, G613 changes from invasive to reactive and there are several instances where the sample names do not match up.

– If other samples (outside of the 27 discussed in Figure 3) were used in Figure 4, it should be made clear. G472, G440, and G473 are not in Figure 3

– A table with all the sample names, GSC subtype identified through ATAC-seq, and validation of GSC should be included

– Figure 4—figure supplement 1 G is missing labels

– Please include the GSEA gene set size.

– Describe how many essentiality genes were included for GSEA genes

– Figure 4D should be revised to better represent the number of mice were used. A detailed experimental design and the survival curve analysis should also be provided.

– Figure 2—figure supplement 1 lacks sample annotation in both the image and the figure legend.

– Figure 4—figure supplement 3 was blacked out in the version provided for review.

– There appears to be a typographical error at the start of Results paragraph five, "FOXD1 is essential only in the Reactive state […]". This seems like it should be, "Invasive".

We are grateful to the reviewers for pointing out these errors. We have corrected them as follows:

– The invasion data is no longer in Figure 4—figure supplement 3A but in Figure 5 and Figure 5—figure supplement 1.

– The labels in Figure 3B have been corrected

– Additional samples used beyond the 27 discussed in Figures 2 and 3 are outlined in the Materials and methods section “ATAC-seq”

– A table with sample names and GSC states has been added as Supplementary file 1

– Labels have been added to Figure 4—figure supplement 1G

– The GSEA gene set sizes have been added to the figure legend

– Essentiality gene numbers used in the GSEA analysis have been added to the Materials and methods section “Gene Essentiality Screen”

– We have added the number of mice used for each of the 37 GSCs in the Materials and methods section “Survival Analysis” and in the figure legend. An average of 5 mice were used for each GSC.

– Annotation has been added to Figure 2—figure supplement 1 in the image and figure

– As mentioned above, the invasion data is no longer in Figure 4 but in Figure 5 and Figure 5—figure supplement 1.

– The typographical error related to FOXD1 has been corrected in the main text.

[Editors’ note: what follows is the authors’ response to the second round of review.]

The current manuscript by Guilhamon et al. uses single cell sequencing technologies to identify and then characterize heterogeneities among glioblastoma (GBM) stem cells (GSC), which extend beyond current disease classifications. The paper will add meaningful knowledge to the field and revisions have enhanced this manuscript. For example, validation of the cells was provided, methodologies were clarified, and impressive data related to the invasive subtype was provided. Additional samples would have enhanced the study, but were difficult to obtain given the COVID-19 pandemic. Overall, though, this would be the first study to apply single cell sequencing specifically to GSCs with the goal of identifying GSC subtypes using primarily chromatin accessibility signatures. Moreover, the use of chromatin accessibility signatures in combination with TF motif prediction to guide identification of transcriptional dependencies is novel in the context of GSC studies. These strengths overcome weaknesses associated with small sample numbers.

1) Please clarify what percentage of cells were estimated to be stem cells, among the scATACseq datasets.

This information has now been added within the Results section as well as within the Materials and methods (Single Cell Atac section).

2) Please modify the statement "A more likely possibility is that given the TCGA subtypes were determined from bulk GBM, they may not fully capture the nature of rarer populations found within a tumor, such as the cancer stem cell populations." Due to the amount of testing needed to test this, please modify this explanation to be broader and to also include the possibility that this is due at least in part to differences in microenvironments.

We have now broadened the statement in the main text and included the possibility of the absence of a tumor microenvironment in the GSC populations. This change can be found in the Results section.

3) Please modify the statement "No molecular signature in GBM has so far been reported that can significantly stratify the poorer prognosis IDH wild-type patients by survival." This is incorrect. DNA methylation profiling classifies IDH wild type tumors into several categories including Mesenchymal-like (Ceccarelli et al., Cell, 2016), which shows significantly worse outcomes even among only MGMT-unmethylated tumors.

In our own review of the manuscript cited by the reviewer, we could not find data that directly supports the identification of a subset of adult IDH-wt GBM tumors with a worse prognosis, particularly as the cohorts used mixed both low-grade gliomas and GBM as well as pediatric and adult samples. However, in order to clarify the statement in the manuscript, we have modified it to reflect that this claim applies only to IDH-wt adult GBM and phrased as follows “molecular signature in adult GBM stratifying the poorer prognosis IDH wild-type patients by survival are lacking.”

4) Regarding the statement "Using the TCGA IDH wild type cohort (n=144)", please clarify that: a) this cohort relates to the subset of GBM IDH wildtype cases with available RNAseq data (presumably); and b) that the GSC states were converted from a chromatin accessibility signature to a RNA expression signature.

We have expanded the Materials and methods section related to the Survival Analysis to include these clarifications and specify how the TCGA cohort was identified as well as the full details of the conversion between the chromatin accessibility and RNA expression signatures.

5) There is no obvious reason why the TCGA IDH wildtype cohort could not also be classified on the basis of the available Affymetrix microarray data, which would enable analysis of a substantially larger cohort in pursuit of a survival difference (n=400). Please include this analysis.

We thank the reviewer for this suggestion. Incorporating the microarray expression data confirms our findings. We have included the results of this analysis in the following text and figures: Results , Figure 5—figure supplement 2D-E and Materials and methods survival analysis section.

Of note, the results of the survival analysis obtained using the TCGA microarray data are very similar to those obtained using the RNAseq (p-value=0.017 vs 0.019) and are featured in the supplementary figure as validation. However, as only 54% of our signature genes were included in the microarray dataset, the original classification based on RNAseq remains more robust and we have elected to keep that original classification within the main manuscript figure.

6) Please include a small paragraph in the Discussion about the limitations of the study; particularly as they relate to a potentially small samples size.

We have now included a paragraph discussion this limitation in the Discussion section.

https://doi.org/10.7554/eLife.64090.sa2

Article and author information

Author details

  1. Paul Guilhamon

    1. Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
    2. Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Conceptualization, Formal analysis, Funding acquisition, Validation, Investigation, Visualization, Writing - original draft, Writing - review and editing
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-8276-5987
  2. Charles Chesnelong

    Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  3. Michelle M Kushida

    Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  4. Ana Nikolic

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
    3. Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  5. Divya Singhal

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
    3. Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  6. Graham MacLeod

    Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, Canada
    Contribution
    Resources
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-6401-9307
  7. Seyed Ali Madani Tonekaboni

    1. Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
    2. Department of Medical Biophysics, University of Toronto, Toronto, Canada
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  8. Florence MG Cavalli

    Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Data curation
    Competing interests
    No competing interests declared
  9. Christopher Arlidge

    Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  10. Nishani Rajakulendran

    Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, Canada
    Contribution
    Resources
    Competing interests
    No competing interests declared
  11. Naghmeh Rastegar

    Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  12. Xiaoguang Hao

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Hotchkiss Brain Institute, University of Calgary, Calgary, Canada
    3. Department of Cell Biology & Anatomy, University of Calgary, Calgary, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-2695-0111
  13. Rozina Hassam

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Hotchkiss Brain Institute, University of Calgary, Calgary, Canada
    3. Department of Cell Biology & Anatomy, University of Calgary, Calgary, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  14. Laura J Smith

    Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Canada
    Contribution
    Investigation, Visualization
    Competing interests
    No competing interests declared
  15. Heather Whetstone

    Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  16. Fiona J Coutinho

    Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    Contribution
    Project administration
    Competing interests
    No competing interests declared
  17. Bettina Nadorp

    Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
    Contribution
    Formal analysis
    Competing interests
    No competing interests declared
  18. Katrina I Ellestad

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
    Contribution
    Investigation
    Competing interests
    No competing interests declared
  19. H Artee Luchman

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Hotchkiss Brain Institute, University of Calgary, Calgary, Canada
    3. Department of Cell Biology & Anatomy, University of Calgary, Calgary, Canada
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  20. Jennifer Ai-wen Chan

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
    3. Department of Pathology and Laboratory Medicine, University of Calgary, Calgary, Canada
    Contribution
    Resources
    Competing interests
    No competing interests declared
  21. Molly S Shoichet

    Institute of Biomaterials and Biomedical Engineering, University of Toronto, Toronto, Canada
    Contribution
    Supervision
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-1830-3475
  22. Michael D Taylor

    1. Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    2. Division of Neurosurgery, University of Toronto, Toronto, Canada
    3. Departments of Molecular Genetics and Surgery, University of Toronto, Toronto, Canada
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  23. Benjamin Haibe-Kains

    1. Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
    2. Department of Medical Biophysics, University of Toronto, Toronto, Canada
    3. Department of Computer Science, University of Toronto, Toronto, Canada
    4. Ontario Institute for Cancer Research, Toronto, Canada
    5. Vector Institute, Toronto, Canada
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  24. Samuel Weiss

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Hotchkiss Brain Institute, University of Calgary, Calgary, Canada
    3. Department of Cell Biology & Anatomy, University of Calgary, Calgary, Canada
    4. Department of Physiology & Pharmacology, University of Calgary, Calgary, Canada
    Contribution
    Supervision
    Competing interests
    No competing interests declared
  25. Stephane Angers

    1. Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, Canada
    2. Department of Biochemistry, Faculty of Medicine, University of Toronto, Toronto, Canada
    Contribution
    Supervision
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0001-7241-9044
  26. Marco Gallo

    1. Clark Smith Brain Tumour Centre, Arnie Charbonneau Cancer Institute, University of Calgary, Calgary, Canada
    2. Alberta Children’s Hospital Research Institute, University of Calgary, Calgary, Canada
    3. Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, Canada
    4. Department of Physiology & Pharmacology, University of Calgary, Calgary, Canada
    Contribution
    Supervision, Writing - review and editing
    Competing interests
    No competing interests declared
  27. Peter B Dirks

    1. Developmental and Stem Cell Biology Program and Arthur and Sonia Labatt Brain tumor Research Centre, The Hospital for Sick Children, Toronto, Canada
    2. Division of Neurosurgery, University of Toronto, Toronto, Canada
    3. Ontario Institute for Cancer Research, Toronto, Canada
    Contribution
    Conceptualization, Supervision, Writing - original draft, Writing - review and editing
    For correspondence
    peter.dirks@sickkids.ca
    Competing interests
    No competing interests declared
  28. Mathieu Lupien

    1. Princess Margaret Cancer Centre, University Health Network, Toronto, Canada
    2. Department of Medical Biophysics, University of Toronto, Toronto, Canada
    Contribution
    Conceptualization, Supervision, Funding acquisition, Writing - original draft, Project administration, Writing - review and editing
    For correspondence
    mlupien@uhnres.utoronto.ca
    Competing interests
    No competing interests declared
    ORCID icon "This ORCID iD identifies the author of this article:" 0000-0003-0929-9478

Funding

Stand Up To Cancer Canada, Genome Canada, CIHR, OICR, AACR (SU2C-AACR-DT-19-15)

  • Michael D Taylor
  • Samuel Weiss
  • Peter B Dirks
  • Mathieu Lupien

CIHR (TGH-158221)

  • Stephane Angers
  • Peter B Dirks
  • Mathieu Lupien

CIHR (MFE 338954)

  • Paul Guilhamon

Princess Margaret Cancer Foundation

  • Mathieu Lupien

CIHR

  • H Artee Luchman
  • Samuel Weiss

Terry Fox Research Institute

  • Stephane Angers
  • Peter B Dirks

Hospital for Sick Children

  • Peter B Dirks

Canadian Cancer Society

  • Stephane Angers

NSERC

  • Marco Gallo

Alliance for Cancer Gene Therapy

  • Marco Gallo

Jessica's Footprint

  • Peter B Dirks

Hopeful Minds

  • Peter B Dirks

The Bresler Family

  • Peter B Dirks

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Acknowledgements

We would like to thank the Princess Margaret Genomics Centre for their assistance in this study and 10× Genomics, especially Adam Jerauld, Ariel Royall, and John Chevillet for training and support. We also thank Aude Gerbaud and Stacey Krunholtz for their help with figure formatting and David Vetrie for his thoughtful input on the manuscript.

Ethics

Human subjects: All tissue samples were obtained following informed consent from patients, and all experimental procedures were performed in accordance with the Research Ethics Board at The Hospital for Sick Children (Toronto, Canada), the University of Calgary Ethics Review Board, and the Health Research Ethics Board of Alberta - Cancer Committee (HREBA). Approval for access to pathological data was obtained from the respective institutional review boards.

Animal experimentation: All animal procedures were performed according to and approved by the Animal Care Committee of the Hospital for Sick Children or the University of Calgary. All attempts are made to minimize the handling time during surgery and treatment so as not to unduly stress the animals. Animals are observed daily after surgery to ensure there are no unexpected complications.

Senior Editor

  1. Kevin Struhl, Harvard Medical School, United States

Reviewing Editor

  1. Lynne-Marie Postovit, University of Alberta, Canada

Reviewer

  1. Roel Verhaak, Jackson Laboratory, United States

Publication history

  1. Received: October 17, 2020
  2. Accepted: January 8, 2021
  3. Accepted Manuscript published: January 11, 2021 (version 1)
  4. Version of Record published: January 29, 2021 (version 2)

Copyright

© 2021, Guilhamon et al.

This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

Metrics

  • 2,500
    Page views
  • 416
    Downloads
  • 1
    Citations

Article citation count generated by polling the highest count across the following sources: Crossref, PubMed Central, Scopus.

Download links

A two-part list of links to download the article, or parts of the article, in various formats.

Downloads (link to download the article as PDF)

Download citations (links to download the citations from this article in formats compatible with various reference manager tools)

Open citations (links to open the citations from this article in various online reference manager services)

Further reading

    1. Cancer Biology
    2. Developmental Biology
    Rediet Zewdu et al.
    Research Article

    Cancer cells undergo lineage switching during natural progression and in response to therapy. NKX2-1 loss in human and murine lung adenocarcinoma leads to invasive mucinous adenocarcinoma (IMA), a lung cancer subtype that exhibits gastric differentiation and harbors a distinct spectrum of driver oncogenes. In murine BRAFV600E driven lung adenocarcinoma, NKX2-1 is required for early tumorigenesis, but dispensable for established tumor growth. NKX2-1-deficient, BRAFV600E driven tumors resemble human IMA and exhibit a distinct response to BRAF/MEK inhibitors. Whereas BRAF/MEK inhibitors drive NKX2-1-positive tumor cells into quiescence, NKX2-1-negative cells fail to exit the cell cycle after the same therapy. BRAF/MEK inhibitors induce cell identity switching in NKX2-1-negative lung tumors within the gastric lineage, which is driven in part by WNT signaling and FoxA1/2. These data elucidate a complex, reciprocal relationship between lineage specifiers and oncogenic signaling pathways in the regulation of lung adenocarcinoma identity that is likely to impact lineage-specific therapeutic strategies.

    1. Cancer Biology
    2. Cell Biology
    Margaret E Torrence et al.
    Research Article Updated

    The mechanistic target of rapamycin complex 1 (mTORC1) stimulates a coordinated anabolic program in response to growth-promoting signals. Paradoxically, recent studies indicate that mTORC1 can activate the transcription factor ATF4 through mechanisms distinct from its canonical induction by the integrated stress response (ISR). However, its broader roles as a downstream target of mTORC1 are unknown. Therefore, we directly compared ATF4-dependent transcriptional changes induced upon insulin-stimulated mTORC1 signaling to those activated by the ISR. In multiple mouse embryo fibroblast and human cancer cell lines, the mTORC1-ATF4 pathway stimulated expression of only a subset of the ATF4 target genes induced by the ISR, including genes involved in amino acid uptake, synthesis, and tRNA charging. We demonstrate that ATF4 is a metabolic effector of mTORC1 involved in both its established role in promoting protein synthesis and in a previously unappreciated function for mTORC1 in stimulating cellular cystine uptake and glutathione synthesis.