Signature-scoring methods developed for bulk samples are not adequate for cancer single-cell RNA sequencing data

  1. Nighat Noureen
  2. Zhenqing Ye
  3. Yidong Chen
  4. Xiaojing Wang
  5. Siyuan Zheng (corresponding author)
  1. The University of Texas Health Science Center at San Antonio, United States
  2. University of Texas Health Science Center at San Antonio, United States

Abstract

Quantifying the activity of gene expression signatures is common in analyses of single-cell RNA sequencing data. Methods originally developed for bulk samples are often used for this purpose without accounting for contextual differences between bulk and single-cell data. More broadly, these methods have not been benchmarked. Here we benchmark five such methods: single sample gene set enrichment analysis (ssGSEA), Gene Set Variation Analysis (GSVA), AUCell, Single Cell Signature Explorer (SCSE), and a new method we developed, Jointly Assessing Signature Mean and Inferring Enrichment (JASMINE). Using cancer as an example, we show that cancer cells consistently express more genes than normal cells. This imbalance biases the performance of the bulk-sample-based method ssGSEA in gold standard tests and downsampling experiments. In contrast, single-cell-based methods are less susceptible. Our results suggest that caution should be exercised when using bulk-sample-based methods in single-cell data analyses, and that cellular contexts should be taken into consideration when designing benchmarking strategies.
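The imbalance described in the abstract, where one group of cells has more detected genes than another, can be illustrated with a small sketch. The code below is not the authors' implementation of JASMINE or of any benchmarked method. It is a minimal toy example, assuming a genes-by-cells count matrix, and the function names `detected_genes_per_cell` and `mean_rank_score` are hypothetical. It counts detected genes per cell and computes a simple signature score from ranks taken only among the genes expressed in each cell.

```python
import numpy as np


def detected_genes_per_cell(counts):
    """Count genes with nonzero expression in each cell.

    Assumes rows are genes and columns are cells.
    """
    counts = np.asarray(counts)
    return (counts > 0).sum(axis=0)


def mean_rank_score(counts, signature_idx):
    """Toy signature score (an illustration, not a published method).

    For each cell, rank only the expressed (nonzero) genes, then average
    the ranks of the expressed signature genes and divide by the number
    of expressed genes. Cells with no expressed signature gene score 0.
    """
    counts = np.asarray(counts, dtype=float)
    n_genes, n_cells = counts.shape
    in_signature = np.zeros(n_genes, dtype=bool)
    in_signature[signature_idx] = True

    scores = np.zeros(n_cells)
    for j in range(n_cells):
        column = counts[:, j]
        expressed = column > 0
        n_expressed = int(expressed.sum())
        if n_expressed == 0 or not (expressed & in_signature).any():
            continue
        # Ranks 1..n_expressed among expressed genes; ties are broken by
        # gene order, which is good enough for a sketch.
        order = np.argsort(column[expressed], kind="stable")
        ranks = np.empty(n_expressed)
        ranks[order] = np.arange(1, n_expressed + 1)
        scores[j] = ranks[in_signature[expressed]].mean() / n_expressed
    return scores


if __name__ == "__main__":
    # Hypothetical counts: 5 genes (rows) by 2 cells (columns).
    counts = np.array([
        [5, 0],
        [3, 2],
        [0, 0],
        [1, 4],
        [2, 0],
    ])
    print(detected_genes_per_cell(counts))
    print(mean_rank_score(counts, [0, 1]))
```

Comparing the per-cell detected-gene counts between groups is one simple way to check whether such an imbalance exists in a data set before choosing a scoring method.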

Data availability

The current manuscript is a computational study, so no data have been generated for this manuscript. The single-cell data sets used in this study, including their download sources, are listed in Supplementary Table 1. Gene sets were downloaded from MSigDB v.7.2. JASMINE source code is available on GitHub (https://github.com/NNoureen/JASMINE). Source Data contain the numerical data used to generate the figures.

Article and author information

Author details

  1. Nighat Noureen

    Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, United States
    Competing interests
    The authors declare that no competing interests exist.
  2. Zhenqing Ye

    Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, United States
    Competing interests
    The authors declare that no competing interests exist.
  3. Yidong Chen

    Greehey Children's Cancer Research Institute, University of Texas Health Science Center at San Antonio, San Antonio, United States
    Competing interests
    The authors declare that no competing interests exist.
  4. Xiaojing Wang

    Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, United States
    Competing interests
    The authors declare that no competing interests exist.
  5. Siyuan Zheng

    Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, United States
    For correspondence
    zhengs3@uthscsa.edu
    Competing interests
    The authors declare that no competing interests exist.
    ORCID: 0000-0002-1031-9424

Funding

Cancer Prevention and Research Institute of Texas (RR170055)

  • Siyuan Zheng

Cancer Prevention and Research Institute of Texas (RP170345)

  • Nighat Noureen

The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.

Reviewing Editor

  1. C Daniela Robles-Espinoza, International Laboratory for Human Genome Research, Mexico

Version history

  1. Preprint posted: July 1, 2021
  2. Received: July 6, 2021
  3. Accepted: February 25, 2022
  4. Accepted Manuscript published: February 25, 2022 (version 1)
  5. Version of Record published: March 11, 2022 (version 2)

Copyright

© 2022, Noureen et al.

This article is distributed under the terms of the Creative Commons Attribution License permitting unrestricted use and redistribution provided that the original author and source are credited.

Cite this article

  1. Nighat Noureen
  2. Zhenqing Ye
  3. Yidong Chen
  4. Xiaojing Wang
  5. Siyuan Zheng
(2022)
Signature-scoring methods developed for bulk samples are not adequate for cancer single-cell RNA sequencing data
eLife 11:e71994.
https://doi.org/10.7554/eLife.71994
