Spatial compartment identity segregates cells by subtype and epithelial-mesenchymal status.

a) Primary cells and cell lines used in this study, arranged by molecular subtype and malignancy status according to previously reported characteristics. Green: normal breast and lung epithelium, red: luminal subtype, blue: HER2-enriched subtype, pink: triple-negative subtype, brown: localized lung cancer. b) Hierarchical clustering of genome-wide compartment identity data for all breast and lung cell types at 250 kb resolution; colors as in 1a. c) Principal component analysis of genome-wide compartment identity data for breast and lung cell types at 250 kb resolution; colors as in 1a. The PC2 axis is divided into two subspaces based on HER2 status (thick blue dotted line). d) GO term enrichment (biological processes) of the genes in genomic regions corresponding to the top 100 positive and top 100 negative elements of the first eigenvector of the genome-wide compartment identity PCA.
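The selection of bins described in panel d can be illustrated with a minimal sketch. It assumes the compartment data are arranged as a matrix of cell types (rows) by 250 kb genomic bins (columns) and uses an SVD-based PCA from NumPy; the function name `compartment_pca`, the random toy matrix, and the choice of SVD are illustrative assumptions, not the authors' pipeline. The sign of a principal component is arbitrary, so "positive" and "negative" here only mean the two opposite ends of the first eigenvector.

```python
import numpy as np

def compartment_pca(matrix, n_top=100):
    """PCA of a (cell types x genomic bins) matrix via SVD.

    Returns per-cell-type scores, the first eigenvector over bins, and the
    indices of the n_top most positive and n_top most negative bins.
    Hypothetical helper for illustration only.
    """
    # Center each bin (column) across cell types before decomposition.
    centered = matrix - matrix.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s              # coordinates of each cell type on each PC
    loadings = vt[0]            # first eigenvector: one weight per bin
    order = np.argsort(loadings)
    negative_bins = order[:n_top]        # most negative elements
    positive_bins = order[::-1][:n_top]  # most positive elements
    return scores, loadings, positive_bins, negative_bins

# Toy data (random, not real compartment values): 6 cell types, 500 bins.
rng = np.random.default_rng(0)
toy = rng.normal(size=(6, 500))
scores, loadings, pos, neg = compartment_pca(toy, n_top=100)
print(scores[:, :2])  # PC1/PC2 coordinates, as plotted in a PCA panel
```

Genes overlapping the bins indexed by `pos` and `neg` would then be passed to a GO enrichment tool of choice.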

Gene expression differences also segregate breast cancer cell lines along an EMT axis.

a) Principal component analysis of transcriptome data for breast and lung cell types in this study. Cell types are colored by molecular subtype as in Fig. 1a. b) GO term enrichment (biological processes) of the genes corresponding to the top 100 positive and top 100 negative elements of the first eigenvector of the transcriptome profile PCA. c, d) Projection of breast and lung cell types onto a curated epithelial-mesenchymal axis based on their gene expression, using (c) gene set variation analysis (GSVA) or (d) non-negative principal component analysis (nnPCA).

Transcription and compartment changes capture distinct sets of EMT-related genomic regions.

a) Overlap of the gene sets obtained from the compartmental analysis PC1 (red), the transcriptome analysis PC1 (green), and the curated breast cancer epithelial-mesenchymal gene set (blue). b-d) Hierarchical clustering of compartmental profiles (top row) and gene expression (bottom row) for the genes obtained from (b) the compartmental analysis PC1, (c) the transcriptome analysis PC1, or (d) the curated set of breast cancer epithelial and mesenchymal genes.
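The three-way overlap in panel a amounts to partitioning the union of three gene sets into the seven regions of a Venn diagram. A minimal sketch using Python sets follows; the function name `venn3_regions` and the placeholder gene identifiers (`G1` to `G6`) are hypothetical and carry no biological meaning.

```python
def venn3_regions(compartment, transcriptome, curated):
    """Split three gene sets into the seven regions of a 3-way Venn diagram."""
    a, b, c = set(compartment), set(transcriptome), set(curated)
    return {
        "compartment_only": a - b - c,
        "transcriptome_only": b - a - c,
        "curated_only": c - a - b,
        "compartment_transcriptome": (a & b) - c,
        "compartment_curated": (a & c) - b,
        "transcriptome_curated": (b & c) - a,
        "all_three": a & b & c,
    }

# Placeholder gene IDs for illustration only.
compartment_genes = {"G1", "G2", "G3", "G4"}
transcriptome_genes = {"G3", "G4", "G5"}
curated_genes = {"G4", "G5", "G6"}

regions = venn3_regions(compartment_genes, transcriptome_genes, curated_genes)
for name, genes in regions.items():
    print(name, len(genes), sorted(genes))
```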

a) Normal breast epithelium (HMEC and MCF10A), localized breast cancer (BT474, HCC1954, and BT549), and lung metastatic breast cancer (MCF7, T47D, SKBR3, and MDA-MB-231) cells are used to model different stages of breast cancer progression. To model diseased lung, normal lung epithelium (HTBE) and localized lung cancer (A549 and H460) cells are used. b) Schematic diagram of the calculation of breast cancer lung permissive changes. First, for both breast and lung, the compartment identity of each cancer cell type is compared with that of the normal epithelial cells. This results in four possible compartment identity combinations: AA (A compartment in both normal and cancer), AB (A compartment in normal and B compartment in cancer), etc. Then, a cross-comparison between the breast and lung systems leads to 16 compartment identity combinations. Two specific combinations, AB_BB and BA_AA, are defined as lung permissive changes (yellow arrows). c) Fraction of lung permissive changes shown by localized and metastatic cancers from different breast cancer subtypes. d-f) TCGA patient gene expression (left) and GO term enrichment (biological processes) (right) for genes from regions exhibiting (d) BA_AA lung permissive changes in the luminal metastatic breast cancer subtype, (e) AB_BB lung permissive changes in the HER2-enriched metastatic breast cancer subtype, or (f) BA_AA lung permissive changes in the triple-negative metastatic breast cancer subtype. TCGA data from the BRCA and LUAD tumor vs. normal sets are used; sample sizes are indicated.
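The classification scheme in panel b can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes that a positive compartment eigenvector value marks the A compartment, that the first letter pair in a label such as AB_BB refers to breast (normal, cancer) and the second to lung (normal, cancer), and that the fraction is taken over all bins. The function names and toy values are hypothetical.

```python
from collections import Counter

# Combinations defined as lung permissive changes in the legend.
PERMISSIVE = {"AB_BB", "BA_AA"}

def compartment(value):
    """Assumption: positive eigenvector value = A compartment, otherwise B."""
    return "A" if value > 0 else "B"

def combination(breast_normal, breast_cancer, lung_normal, lung_cancer):
    """Label one genomic bin with one of the 16 combinations, e.g. 'AB_BB'."""
    breast = compartment(breast_normal) + compartment(breast_cancer)
    lung = compartment(lung_normal) + compartment(lung_cancer)
    return f"{breast}_{lung}"

def permissive_fraction(breast_normal, breast_cancer, lung_normal, lung_cancer):
    """Fraction of bins showing AB_BB or BA_AA, plus counts per combination."""
    labels = [
        combination(*values)
        for values in zip(breast_normal, breast_cancer, lung_normal, lung_cancer)
    ]
    counts = Counter(labels)
    n_permissive = sum(counts[label] for label in PERMISSIVE)
    return n_permissive / len(labels), counts

# Toy eigenvector values for four bins (made up for illustration).
bn = [1.0, -0.5, 0.8, -0.2]   # normal breast
bc = [-0.3, 0.4, 0.9, -0.6]   # breast cancer
ln = [-0.7, 0.2, 0.5, 0.1]    # normal lung
lc = [-0.1, 0.6, 0.3, -0.4]   # lung cancer

fraction, counts = permissive_fraction(bn, bc, ln, lc)
print(fraction, dict(counts))
```

In this toy input, the first bin is labeled AB_BB and the second BA_AA, so two of the four bins count as lung permissive.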