Simultaneous enumeration of cancer and immune cell types from bulk tumor gene expression data

  1. Julien Racle
  2. Kaat de Jonge
  3. Petra Baumgaertner
  4. Daniel E Speiser
  5. David Gfeller (corresponding author)
  1. University of Lausanne, Switzerland
  2. Swiss Institute of Bioinformatics, Switzerland
  3. Lausanne University Hospital (CHUV), Switzerland
8 figures, 2 tables and 4 additional files

Figures

Figure 1 with 2 supplements
Estimating the proportion of immune and cancer cells.

(A) Schematic description of our method. (B) Matrix formulation of our algorithm, including the uncharacterized cell types (red box) with no or very low expression of signature genes (green box). (C)…

https://doi.org/10.7554/eLife.26476.003
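
The matrix formulation mentioned in panel (B) can be sketched as follows. This is a rough illustration only: the symbols (b for the bulk expression vector, C for the reference profile matrix, p for the cell proportions, S for the set of signature genes, "unc" for the uncharacterized cells) and the unweighted least-squares form are assumptions made here, since the caption above is truncated and does not give the exact notation or weighting.

```latex
% Illustrative sketch; notation and the unweighted objective are assumptions.
\begin{aligned}
b_g &= \sum_{j=1}^{m} C_{gj}\, p_j + C_{g,\mathrm{unc}}\, p_{\mathrm{unc}},
  \qquad p_j \ge 0,\quad \sum_{j=1}^{m} p_j + p_{\mathrm{unc}} = 1, \\
C_{g,\mathrm{unc}} &\approx 0 \ \text{for } g \in S
  \;\Longrightarrow\; b_g \approx \sum_{j=1}^{m} C_{gj}\, p_j \quad (g \in S), \\
\hat{p} &= \operatorname*{arg\,min}_{p \ge 0,\ \sum_j p_j \le 1}
  \sum_{g \in S} \Bigl( b_g - \sum_{j=1}^{m} C_{gj}\, p_j \Bigr)^{2},
  \qquad \hat{p}_{\mathrm{unc}} = 1 - \sum_{j=1}^{m} \hat{p}_j .
\end{aligned}
```

Under these assumptions, restricting the fit to signature genes removes the unknown profile of the uncharacterized cells from the problem, and their proportion follows from what the reference cell types leave unexplained.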
Figure 1—figure supplement 1
Low dimensionality representation of the tumor-infiltrating cell samples.

Principal component analysis of the samples used to build the reference gene expression profiles from tumor-infiltrating immune cells, based on the data from Tirosh et al. (2016), considering only …

https://doi.org/10.7554/eLife.26476.004
Figure 1—figure supplement 2
Cell type mRNA content.

(A) mRNA content per cell type obtained for cell types sorted from blood. Values for B, NK, T cells and monocytes were obtained as described in Materials and methods. Values for Neutrophils are from …

https://doi.org/10.7554/eLife.26476.005
Figure 2 with 4 supplements
Predicting cell fractions in blood samples.

(A) Predicted vs. measured immune cell proportions in PBMC (dataset 1 (Zimmermann et al., 2016), dataset 2 (Hoek et al., 2015)) and whole blood (dataset 3 (Linsley et al., 2014)); predictions are …

https://doi.org/10.7554/eLife.26476.006
Figure 2—figure supplement 1
Comparison of multiple cell fraction prediction methods in blood datasets.

Heatmaps show (A) the Pearson R correlation and (B) the root mean squared error, between the cell fractions predicted by each method and the experimentally measured fractions (dataset 1 [Zimmermann …

https://doi.org/10.7554/eLife.26476.007
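
The two accuracy metrics named in this caption, the Pearson correlation and the root mean squared error, can be computed as in the minimal sketch below. The function names and the example numbers are hypothetical and chosen only for illustration; they are not taken from any dataset in the article.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)


def rmse(predicted, measured):
    """Root mean squared error between two equal-length sequences."""
    n = len(predicted)
    if n != len(measured) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared = sum((p - m) ** 2 for p, m in zip(predicted, measured))
    return math.sqrt(squared / n)


# Hypothetical cell fractions, for illustration only.
predicted_fractions = [0.10, 0.25, 0.40, 0.05]
measured_fractions = [0.12, 0.20, 0.45, 0.03]
print("R =", pearson_r(predicted_fractions, measured_fractions))
print("RMSE =", rmse(predicted_fractions, measured_fractions))
```

The two metrics are complementary: a method can rank samples correctly (high R) while being systematically off in absolute value (high RMSE), which is presumably why both heatmaps are shown.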
Figure 2—figure supplement 2
Effect of including an mRNA renormalization step for multiple cell fraction prediction methods.

Pearson R correlations are shown as in Figure 2—figure supplement 1A, showing here for each method its original result and the result if the predicted proportions are then renormalized by the mRNA …

https://doi.org/10.7554/eLife.26476.008
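
A renormalization step of this kind can be sketched as below. The sketch assumes that a method's raw output is a proportion of mRNA per cell type, that each cell type has a known relative mRNA content per cell, and that the result is rescaled to sum to 1. The function name, the dictionary layout and the example values are hypothetical, and the truncated caption above does not confirm the exact procedure.

```python
def renormalize_by_mrna(mrna_fractions, mrna_per_cell):
    """Convert mRNA proportions into cell proportions.

    mrna_fractions: dict mapping cell type -> predicted mRNA proportion.
    mrna_per_cell:  dict mapping cell type -> relative mRNA content per cell
                    (must be positive; assumed known for every cell type).
    Returns a dict of cell proportions rescaled to sum to 1 (an assumption).
    """
    scaled = {}
    for cell_type, fraction in mrna_fractions.items():
        content = mrna_per_cell[cell_type]
        if content <= 0:
            raise ValueError(f"mRNA content must be positive for {cell_type}")
        # A cell type with more mRNA per cell needs fewer cells to
        # contribute the same share of the bulk signal.
        scaled[cell_type] = fraction / content
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("all proportions are zero; nothing to renormalize")
    return {cell_type: value / total for cell_type, value in scaled.items()}


# Hypothetical values, for illustration only.
example = renormalize_by_mrna(
    {"cell type A": 0.5, "cell type B": 0.5},
    {"cell type A": 1.0, "cell type B": 2.0},
)
print(example)
```

In this toy case both cell types contribute equally to the mRNA pool, but type B carries twice as much mRNA per cell, so it ends up with half as many cells as type A.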
Figure 2—figure supplement 3
Effect of the various steps in EPIC on the prediction accuracy.

Comparison of the predictions, as in Figure 2—figure supplement 1A, for different variations of EPIC: (1) full EPIC method; (2) EPIC if the gene expression reference profiles are scaled a …

https://doi.org/10.7554/eLife.26476.009
Figure 2—figure supplement 4
Results with or without known reference profiles for T cells for the cell fraction predictions from various methods.

Results are shown as in Figure 2—figure supplement 1A. Here, we present, for various cell fraction prediction methods, the results considering all the immune cell types in the gene …

https://doi.org/10.7554/eLife.26476.010
Figure 3 with 1 supplement
Predicting cell fractions in solid tumors with reference profiles from circulating cells.

(A) Comparison of EPIC predictions with our flow cytometry data of lymph nodes from metastatic melanoma patients. (B) Comparison with immunohistochemistry data from colon cancer primary tumors (Becht…

https://doi.org/10.7554/eLife.26476.011
Figure 3—figure supplement 1
Sketch of the experiment designed to validate EPIC predictions starting from in vivo tumor samples.
https://doi.org/10.7554/eLife.26476.012
Figure 4 with 1 supplement
Predictions with reference profiles from tumor-infiltrating cells.

Same as Figure 3 but based on reference profiles built from the single-cell RNA-Seq data of primary tumor and non-lymphoid metastatic melanoma samples from Tirosh et al. (2016). (A) Comparison with …

https://doi.org/10.7554/eLife.26476.013
Figure 4—figure supplement 1
Comparison of EPIC results per cell type for gene expression reference profiles from circulating or tumor-infiltrating immune cells.

(A) Pearson R correlation and (B) RMSE between the cell fractions predicted and the experimentally measured fractions (from flow cytometry of lymph nodes from metastatic melanoma patients (this …

https://doi.org/10.7554/eLife.26476.014
Figure 5 with 6 supplements
Performance comparison with other methods in tumor samples.

(A) Pearson correlation R-values between the cell proportions predicted by EPIC and ISOpure and the observed proportions measured by flow cytometry or single-cell RNA-Seq (Tirosh et al., 2016), …

https://doi.org/10.7554/eLife.26476.015
Figure 5—figure supplement 1
Comparison of multiple cell fraction prediction methods in tumor datasets.

(A) Pearson R correlation and (B) root mean squared error between the cell fractions predicted by each method and the experimentally measured fractions (from flow cytometry (this study), colorectal …

https://doi.org/10.7554/eLife.26476.016
Figure 5—figure supplement 2
Comparison of cell fraction prediction methods with flow cytometry data of melanoma tumors.

(A) Direct comparison of all cell types together. When a cell type could not be predicted by a given method, it is absent from the subfigure. (B) Comparison per cell type for …

https://doi.org/10.7554/eLife.26476.017
Figure 5—figure supplement 3
Comparison of cell fraction prediction methods with immunohistochemistry data in colon cancer data (Becht et al., 2016) for T cell, CD8 T cell and macrophage infiltration values.

Observed values are in number of cells/mm². Correlation values are available in Figure 5—figure supplement 1.

https://doi.org/10.7554/eLife.26476.018
Figure 5—figure supplement 4
Comparison of cell fraction prediction methods with single-cell RNA-Seq data from melanoma tumors (Tirosh et al., 2016).

(A) Direct comparison of all cell types together. When a cell type could not be predicted by a given method, it is absent from the subfigure. (B) Results for MCP-counter, splitting the …

https://doi.org/10.7554/eLife.26476.019
Figure 5—figure supplement 5
Comparison between ESTIMATE scores (A) and EPIC predictions (B) in our new flow cytometry dataset.

The predictions are compared to the observed cell proportions. ESTIMATE returns a score of global immune infiltration and thus the sum of all observed immune cells has been taken for the comparison. …

https://doi.org/10.7554/eLife.26476.020
Figure 5—figure supplement 6
Predicting Thelper and Treg cell fractions in tumors.

The proportions of Thelper and Treg cells predicted by EPIC and CIBERSORT are compared to the proportions observed in the bulk samples reconstructed from the single-cell RNA-seq data from melanoma …

https://doi.org/10.7554/eLife.26476.021
Author response image 1
Comparison between EPIC predictions and measured cell fractions in PBMC dataset from Zimmermann et al. 2016.
https://doi.org/10.7554/eLife.26476.031
Author response image 2
Comparison between the experimentally measured cell fractions and EPIC predictions, including additional cell types in: (A) our expanded flow cytometry analysis of melanoma; (B) lymph node metastasis and primary tumor melanoma data from Tirosh et al., 2016.
https://doi.org/10.7554/eLife.26476.032
Author response image 3
Comparison of the prediction accuracies for EPIC, ISOpure based on all genes and ISOpure based on the subset of signature genes we derived for EPIC.

(A) For all immune cell types in the blood datasets (dataset 1: Zimmermann et al. 2016; dataset 2: Hoek et al. 2015; dataset 3: Linsley et al. 2014). (B) and (C) in the tumor datasets, based on all …

https://doi.org/10.7554/eLife.26476.033

Tables

Appendix 1—table 1
Gene markers used per cell type.

Only markers of cell types present in the respective reference gene expression profiles are used.

https://doi.org/10.7554/eLife.26476.027
Cell type | Gene markers
B cells | BANK1, CD79A, CD79B, FCER2, FCRL2, FCRL5, MS4A1, PAX5, POU2AF1, STAP1, TCL1A
CAFs | ADAM33, CLDN11, COL1A1, COL3A1, COL14A1, CRISPLD2, CXCL14, DPT, F3, FBLN1, ISLR, LUM, MEG3, MFAP5, PRELP, PTGIS, SFRP2, SFRP4, SYNPO2, TMEM119
CD4 T cells | ANKRD55, DGKA, FOXP3, GCNT4, IL2RA, MDS2, RCAN3, TBC1D4, TRAT1
CD8 T cells | CD8B, HAUS3, JAKMIP1, NAA16, TSPYL1
Endothelial cells | CDH5, CLDN5, CLEC14A, CXorf36, ECSCR, F2RL3, FLT1, FLT4, GPR4, GPR182, KDR, MMRN1, MMRN2, MYCT1, PTPRB, RHOJ, SLCO2A1, SOX18, STAB2, VWF
Macrophages | APOC1, C1QC, CD14, CD163, CD300C, CD300E, CSF1R, F13A1, FPR3, HAMP, IL1B, LILRB4, MS4A6A, MSR1, SIGLEC1, VSIG4
Monocytes | CD33, CD300C, CD300E, CECR1, CLEC6A, CPVL, EGR2, EREG, MS4A6A, NAGA, SLC37A2
Neutrophils | CEACAM3, CNTNAP3, CXCR1, CYP4F3, FFAR2, HIST1H2BC, HIST1H3D, KY, MMP25, PGLYRP1, SLC12A1, TAS2R40
NK cells | CD160, CLIC3, FGFBP2, GNLY, GNPTAB, KLRF1, NCR1, NMUR1, S1PR5, SH2D1B
T cells | BCL11B, CD5, CD28, IL7R, ITK, THEMIS, UBASH3A
Appendix 2—table 1
Characteristics of the patients with metastatic melanoma and corresponding lymph node samples.
https://doi.org/10.7554/eLife.26476.029
Patient | Age (years) | Gender | Tissue
LAU125 | 59 | male | iliac lymph node
LAU355 | 70 | female | iliac-obturator lymph node
LAU1255 | 87 | male | axillary lymph node
LAU1314 | 81 | male | iliac-obturator lymph node

Additional files

Supplementary file 1

Gene expression reference profiles, built from TPM (transcripts per million) normalized RNA-Seq data of immune cells sorted from blood as described in the Materials and methods: ‘Reference gene expression profiles from circulating cells’.

The file includes two sheets: (A) the reference gene expression values; (B) the gene variability relating to the reference profile. Columns indicate the reference cell types; rows indicate the gene names.

https://doi.org/10.7554/eLife.26476.022
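
For readers unfamiliar with TPM normalization, the standard definition can be sketched as follows: each gene's read count is divided by its transcript length, and the resulting rates are rescaled to sum to one million. The function name and the example counts and lengths are hypothetical; this illustrates the general definition only, not the specific pipeline used to build the reference profiles.

```python
def counts_to_tpm(counts, lengths_kb):
    """Convert raw read counts to TPM (transcripts per million).

    counts:     list of read counts, one per gene.
    lengths_kb: list of transcript lengths in kilobases, same order.
    """
    if len(counts) != len(lengths_kb):
        raise ValueError("counts and lengths must have the same length")
    if any(length <= 0 for length in lengths_kb):
        raise ValueError("transcript lengths must be positive")
    # Length-normalized rate per gene (reads per kilobase).
    rates = [count / length for count, length in zip(counts, lengths_kb)]
    total = sum(rates)
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")
    # Rescale so that the values of one sample sum to one million.
    return [rate / total * 1e6 for rate in rates]


# Hypothetical counts and lengths, for illustration only.
tpm = counts_to_tpm([10, 20, 30], [1.0, 2.0, 0.5])
print(tpm)
```

Because every sample sums to the same total, TPM values are comparable across samples as relative abundances, which suits a reference profile that is meant to be combined linearly across cell types.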
Supplementary file 2

Gene expression reference profiles built from tumor-infiltrating cells obtained from TPM normalized single-cell RNA-Seq data as described in the Materials and methods: ‘Reference profiles from tumor-infiltrating cells’.

The file includes two sheets: (A) the reference gene expression values; (B) the gene variability relating to the reference profile. Columns indicate the reference cell types; rows indicate the gene names.

https://doi.org/10.7554/eLife.26476.023
Supplementary file 3

Proportion of cells measured in the different datasets: (A) this study; (B) dataset 1 (Zimmermann et al., 2016); (C) dataset 2 (Hoek et al., 2015); (D) dataset 3 (Linsley et al., 2014); and (E) single-cell RNA-Seq dataset (Tirosh et al., 2016).

The ‘Other cells’ type always corresponds to the remaining cells that were not assigned to any of the cell types given in the tables.

https://doi.org/10.7554/eLife.26476.024
Transparent reporting form
https://doi.org/10.7554/eLife.26476.025
