Overview of the retriever algorithm.

retriever generates disease-specific transcriptional drug response signatures by merging transcriptional signatures over time, concentration, and cell type. These signatures can then be matched to single-cell or bulk expression profiles to predict the drugs and drug combinations most likely to be effective in treating a disease.

Single-cell atlas of breast samples.

(A) UMAP projection of the integrated 77,384 cells from 36 breast samples. (B) UMAP projection of the healthy (left) and cancer (center) cells, and visualization of triple-positive epithelial cells in the cancer samples. (C) Dotplot representing the normalized expression level and percentage of cells expressing the top five differentially expressed genes for each cell type.

Transcriptional changes identified in TNBC cells.

(A) Volcano plots report the difference between TNBC and healthy samples: on the left, based on single-cell RNA-seq count data and computed using MAST; in the middle, using the pseudo-bulk measures in each single-cell RNA-seq sample; and on the right, using the bulk RNA-seq data from the BRCA-TCGA project. Each dot represents a gene. Dots are color coded, in red if the log2 fold-change is larger than 1 and in blue if the log2 fold-change is smaller than -1. (B) Comparisons of the transcriptional changes associated with TNBC at the single-cell, pseudo-bulk, and tissue level. Each dot represents a gene. Dots are color coded as in Figure 2A. (C) Single-sample GSEA Enrichment Score (ES) for the transcriptional changes between TNBC and healthy cells at different levels of resolution. Labeled pathways are those that show the same trend (positive or negative ES) in the three data types.
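
As a rough illustration of the pseudo-bulk measures and the color coding described in (A), the sketch below sums per-cell counts within each sample and assigns a color from the log2 fold-change thresholds given in the caption. Summation as the aggregation rule, the function names, and the toy gene names are assumptions made for illustration; the caption does not specify how the pseudo-bulk values were computed.

```python
from collections import defaultdict


def pseudo_bulk(cells):
    """Sum per-cell counts within each sample.

    cells: iterable of (sample_id, {gene: count}) pairs.
    Summation is an assumed aggregation rule, not one stated in the caption.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for sample_id, counts in cells:
        for gene, count in counts.items():
            totals[sample_id][gene] += count
    return {sample: dict(genes) for sample, genes in totals.items()}


def volcano_color(log2_fc, cutoff=1.0):
    """Color rule from the caption: red above +1, blue below -1."""
    if log2_fc > cutoff:
        return "red"
    if log2_fc < -cutoff:
        return "blue"
    return None  # the caption does not name a color for the remaining genes


# Hypothetical toy input: three cells from two samples.
cells = [
    ("sample_1", {"GENE_A": 3, "GENE_B": 0}),
    ("sample_1", {"GENE_A": 1, "GENE_B": 2}),
    ("sample_2", {"GENE_A": 5}),
]
bulk = pseudo_bulk(cells)
```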

Case example showing the construction of a single transcriptional response profile to a compound across three TNBC cell lines.

Frames of the scatterplots displaying the relationship between the profiles and the averaged profiles are color coded, in black if the computed Spearman correlation coefficient is larger than 0.6 or in gray otherwise. (A) Generation of a time-consistent response profile. (B) Generation of a time- and concentration-consistent response profile. (C) Generation of a time-, concentration-, and cell-line-consistent response profile.
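
One way to picture a single merging step is the sketch below: the profiles are averaged, each profile is compared with that average by Spearman correlation, and only the profiles above the 0.6 threshold are averaged again. The caption states only that the threshold determines the frame color, so treating it as an inclusion filter is an assumption, as are the function names and the toy values.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation computed as Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def consistent_average(profiles, threshold=0.6):
    """Average only the profiles that correlate with the overall average.

    profiles: list of equal-length lists (one value per gene).
    Returns None when no profile passes the threshold.
    """
    overall = [mean(column) for column in zip(*profiles)]
    kept = [p for p in profiles if spearman(p, overall) > threshold]
    if not kept:
        return None
    return [mean(column) for column in zip(*kept)]


# Hypothetical toy input: two concordant time points and one discordant one.
profiles = [
    [2.0, 1.0, -1.0, -2.0],
    [1.8, 1.2, -0.8, -2.2],
    [-2.0, -1.0, 1.0, 2.0],
]
merged = consistent_average(profiles)
```

In this toy case the third profile is anti-correlated with the average and is dropped, so the merged profile is the mean of the first two. Applying the same step again across concentrations and then across cell lines would mirror panels (B) and (C).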

Correlation analysis and mechanism of action prediction for the disease-specific drug response profiles in TNBC.

(A) Spearman correlation analysis between the expression changes associated with TNBC and the disease-specific drug response signature of QL-XII-47. Each dot represents a gene. Dots are color coded, in red if they are expected to be upregulated by the drug, in blue if they are expected to be downregulated by the drug, and in gray if no significant change is expected. The red line represents the perfect match between both profiles. Density lines reflect the number of dots in each section of the plot. (B) Mechanisms of action predicted for QL-XII-47 in TNBC based on the enrichment of biological pathways in the disease-specific transcriptional drug response signature. (C) Independent sensitivity evaluation of the effect of QL-XII-47 in two other TNBC cell lines. (D) Spearman correlation analysis between the expression changes associated with TNBC and the combination signature of QL-XII-47 and GSK-690693. Each dot represents a gene. Dots are color coded as in (A). The red line represents the perfect match between both profiles. Density lines reflect the number of dots in each section of the plot. (E) Mechanisms of action predicted for the mixture of QL-XII-47 and GSK-690693 based on the enrichment of their combination drug response signature. (F) Comparisons of cellular viability among three distinct breast cancer cell lines following treatment with 0.6 µM QL-XII-47, 0.8 µM GSK-690693, and the combination of both compounds at the same concentrations. The expected viability under an additive drug effect is shown in red. P values were calculated using a one-sided t-test: *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001.
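
The gene-level comparison in (A) and (D) can be sketched as a Spearman correlation over the genes shared by the two profiles. The example below uses the classic rank-difference formula, which is valid only when there are no tied values. The gene names, the values, and the function names are hypothetical, and the caption does not state how the resulting coefficient is turned into a ranking.

```python
def ranks(values):
    """Return 1-based ranks; assumes all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_no_ties(x, y):
    """Spearman's rho via 1 - 6 * sum(d^2) / (n * (n^2 - 1)); no ties allowed."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical disease profile (log2 fold-changes) and drug response signature.
disease = {"GENE_A": 2.1, "GENE_B": 1.3, "GENE_C": -0.4, "GENE_D": -1.8}
drug = {"GENE_A": -1.5, "GENE_B": -0.7, "GENE_C": 0.2, "GENE_D": 1.1, "GENE_E": 0.3}

# Only genes measured in both profiles can be compared.
shared = sorted(set(disease) & set(drug))
rho = spearman_no_ties([disease[g] for g in shared], [drug[g] for g in shared])
print(rho)  # -1.0: the toy drug signature is perfectly opposite to the disease profile
```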

Heatmap displaying the Spearman correlation coefficients for the top-ranking compounds and compound combinations in each sample within the generated atlas.

The compounds or combinations that ranked 1st in at least one sample are included. The number of times each compound or combination ranked 1st across all samples is indicated in parentheses.
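
The counts shown in parentheses can be reproduced with a simple tally, sketched below under the assumption that a per-sample ranking is already available as an ordered list with the top-ranked entry first. The sample and compound names are placeholders.

```python
from collections import Counter


def first_place_counts(rankings):
    """Count how often each compound or combination is ranked 1st.

    rankings: {sample_id: [entry ranked 1st, entry ranked 2nd, ...]}
    """
    return Counter(ranked[0] for ranked in rankings.values() if ranked)


# Hypothetical per-sample rankings.
rankings = {
    "sample_1": ["compound_X", "compound_Y"],
    "sample_2": ["compound_X + compound_Y", "compound_X"],
    "sample_3": ["compound_X", "compound_X + compound_Y"],
}
counts = first_place_counts(rankings)

# Row labels in the style described in the caption, e.g. "compound_X (2)".
labels = [f"{name} ({n})" for name, n in counts.most_common()]
```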

Comparisons of ESR1, PGR, and ERBB2 expression levels across epithelial cells in cancer and healthy tissues.

TP stands for Triple-Positive, and TN for Triple-Negative.

Comparisons of the transcriptional changes associated with TNBC at the single-cell and tissue level across different subpopulations of cells.

Each dot represents a gene. Dots are color coded, in red if the log2 fold-change is larger than 1 and in blue if the log2 fold-change is smaller than -1. TP = Triple-Positive, TN = Triple-Negative.

Expression of markers for epithelial cells (EPCAM), proliferating cells (MKI67), T cells (CD3D), myeloid cells (CD68), B cells (MS4A1), plasmablasts (JCHAIN), endothelial cells (PECAM1), mesenchymal cells (fibroblasts/perivascular-like cells; PDGFRB), and muscular cells (ACTA2).

Hallmark MSigDB signatures associated with the differentially expressed genes observed in triple-negative breast cancer (TNBC).

False discovery rate (FDR) values were computed using Gene Set Enrichment Analysis (GSEA).