Solitary Fibrous Tumors express a neuronal gene signature.

(a) Table of the eight formalin-fixed paraffin-embedded (FFPE) samples used for RNA-seq, listing the sample ID, age at surgery, sex, diagnosis of tumor status, and site of resection.

(b) Graphic demonstrating the workflow for FFPE RNA-seq. RNA was extracted from FFPE tumor and matching normal tissue and sequenced using exon-targeted sequencing, which increases the quality of FFPE RNA-seq data.

(c) The most abundant gene fusion present in all SFTs in FFPE RNA-seq was NAB2-STAT6, with exons 1-4 of NAB2 and exons 2-22 of STAT6. Graphic generated with Arriba, a fusion detection algorithm.

(d) Volcano plot showing differentially expressed genes in SFTs versus matching normal tissues as determined by FFPE RNA-seq (n = 8). 2,429 genes were upregulated (indicated by red dots) and 3,769 genes were downregulated (indicated by blue dots). Fold change >1, FDR <0.1.

(e) Biological pathway GO analysis of the 2,429 genes upregulated in SFTs revealed enrichment for developmental and specifically neuronal developmental pathways.

(f) TRANSFAC motif (transcription factor motifs at +/-1 kb from the transcription start site) GO analysis of the 2,429 genes upregulated in SFTs revealed enrichment for DB1, MAZ, MOVO-B, EGR1, and WT1 motifs.

© 2024, BioRender Inc. Any parts of this image created with BioRender are not made available under the same license as the Reviewed Preprint, and are © 2024, BioRender Inc.
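The cutoffs quoted for the volcano plot in panel (d) (fold change >1, FDR <0.1) can be illustrated with a minimal sketch. It assumes that "fold change >1" refers to an absolute log2 fold change above 1, which the legend does not state explicitly. The `GeneResult` type, the `classify` function, and the toy values are hypothetical and are not taken from the study's data.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log2_fold_change: float  # tumor versus matching normal tissue
    fdr: float               # multiple-testing adjusted p-value


def classify(result: GeneResult,
             lfc_cutoff: float = 1.0,
             fdr_cutoff: float = 0.1) -> str:
    """Label a gene 'up', 'down', or 'not_significant'.

    Assumption: the legend's 'fold change >1' is read as |log2FC| > 1.
    """
    if result.fdr >= fdr_cutoff:
        return "not_significant"
    if result.log2_fold_change > lfc_cutoff:
        return "up"
    if result.log2_fold_change < -lfc_cutoff:
        return "down"
    return "not_significant"


# Toy values for illustration only.
toy_results = [
    GeneResult("GENE_A", 2.5, 0.01),   # strong increase, significant
    GeneResult("GENE_B", -1.8, 0.05),  # strong decrease, significant
    GeneResult("GENE_C", 3.0, 0.40),   # large change, fails the FDR cutoff
    GeneResult("GENE_D", 0.4, 0.001),  # significant, but small change
]

labels = {r.gene: classify(r) for r in toy_results}
n_up = sum(1 for label in labels.values() if label == "up")
n_down = sum(1 for label in labels.values() if label == "down")
```

Counting the "up" and "down" labels over a full results table is how the red and blue dot totals of a volcano plot would be obtained under these assumptions.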

Generation of an inducible NAB2-STAT6 system to investigate early transcriptional changes.

(a) We generated a doxycycline-inducible clone that expresses NAB2-STAT6 (NAB2 exons 1-4, STAT6 exons 2-22) with a C-terminal FLAG tag. Immunoblot analysis of whole cell extracts using a FLAG antibody shows strong expression of NAB2-STAT6 after 1, 2, and 3 days of doxycycline treatment. GAPDH was used as a loading control.

(b) Heatmap clustering analysis of 2,430 genes that are differentially expressed (fold change >1, FDR <0.1) across 1, 2, and 3 days of NAB2-STAT6 expression (Dox) as determined by 3’ mRNA Quant-seq (n = 4).

(c) Volcano plot showing differentially expressed genes in cells expressing NAB2-STAT6 (Dox) for 2 days versus control cells as determined by 3’ mRNA Quant-seq (n = 4). 562 genes were upregulated (indicated by red dots) and 211 genes were downregulated (indicated by blue dots). Fold change >1, FDR <0.1.

(d) Biological pathway GO analysis of the 562 genes upregulated in cells expressing NAB2-STAT6 (Dox) for 2 days revealed enrichment for neuronal developmental pathways.

(e) TRANSFAC motif (transcription factor motifs at +/-1 kb from the transcription start site) GO analysis of the 562 genes upregulated in cells expressing NAB2-STAT6 (Dox) for 2 days revealed enrichment for EGR1 and PATZ motifs.

(f) We plotted normalized read counts of NAB1, NAB2, EGR1, EGR2, STAT6, the EGR target IGF2, and neuronal markers (LHX2, ROBO2, and SHOX2) over 3 days of NAB2-STAT6 (Dox) expression. Targets of EGR1 were gradually upregulated: NAB1, NAB2, EGR1, IGF2, LHX2, ROBO2, and SHOX2.

EGR1 targeted promoters and enhancers are activated by NAB2-STAT6.

(a) Average profiles and heatmaps of NAB2-STAT6 FLAG and EGR1 ChIP-seq and ATAC-seq at 1394 NAB2-STAT6 FLAG peaks in control U2OS cells and in U2OS cells expressing NAB2-STAT6 (Dox) for 2 days. NAB2-STAT6 FLAG becomes significantly localized to these peaks, which show significant increases in EGR1 and ATAC-seq signal.

(b) Motif analysis of 1394 NAB2-STAT6 FLAG peaks using HOMER shows EGR1, EGR2, and WT1 as the most significantly enriched TF matrices.

(c) GSEA shows that genes nearest to NAB2-STAT6 FLAG peaks (n = 1394) are significantly upregulated after 2 days of NAB2-STAT6 (Dox) expression in U2OS cells when compared with control cells from Fig 2.

(d) Screenshot displays two enhancers and the promoter (highlighted in yellow) of KNDC1 that gain NAB2-STAT6 FLAG localization and have increases in EGR1 localization and accessibility by ATAC-seq.

(e) Screenshot displays an enhancer (highlighted in yellow) of IGF2 that gains NAB2-STAT6 FLAG localization and has increased EGR1 localization and accessibility by ATAC-seq.

NAB2-STAT6 localizes to EGR1 targets in primary tumors.

(a) Summary of the strategy to profile NAB2-STAT6 binding in a primary SFT. The primary tumor was dissociated, and single cells were isolated and fixed. Fixed cells were then used for NAB2 and STAT6 ChIP-seq. Peaks present in both the NAB2 and STAT6 ChIP-seq were characterized as NAB2-STAT6 peaks.

(b) Average profiles and heatmaps of NAB2, STAT6, and RNAPII ChIP-seq in a primary SFT at 5921 NAB2 only peaks, 718 NAB2-STAT6 peaks, and 1285 STAT6 only peaks. NAB2-STAT6 peaks had significant NAB2, STAT6, and RNAPII signal.

(c) Top two motifs from HOMER motif analysis of 5921 NAB2 only peaks, 718 NAB2-STAT6 peaks, and 1285 STAT6 only peaks. EGR1 and WT1 are the most significantly enriched TF matrices at NAB2 and NAB2-STAT6 sites, and GRE and STAT3 at STAT6 sites.

(d) GSEA shows that genes nearest to NAB2-STAT6 peaks (n = 718) are significantly upregulated in SFTs when compared with matching normal tissue from Fig 1.

(e) Screenshot displays the promoter (highlighted in yellow) of KLF10 that has significant NAB2, STAT6, and RNAPII localization in SFTs and in U2OS gains NAB2-STAT6 FLAG localization and has increased EGR1 localization and accessibility by ATAC-seq.

(f) Screenshot displays an enhancer and promoter (highlighted in yellow) of NAB2 that have significant NAB2 and STAT6 localization in SFTs and in U2OS gain NAB2-STAT6 FLAG localization and have increased EGR1 localization and accessibility by ATAC-seq.

© 2024, BioRender Inc. Any parts of this image created with BioRender are not made available under the same license as the Reviewed Preprint, and are © 2024, BioRender Inc.
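The peak classification described in panel (a), where peaks found in both the NAB2 and STAT6 ChIP-seq are called NAB2-STAT6 peaks, can be sketched as a simple interval overlap. This is an illustration only: the `Peak` tuple layout, the `partition_peaks` helper, the half-open coordinate convention, and the toy coordinates are assumptions, and the legend does not say which tool or overlap rule was used.

```python
from typing import List, Tuple

# A peak is (chromosome, start, end) with half-open coordinates (assumption).
Peak = Tuple[str, int, int]


def overlaps(a: Peak, b: Peak) -> bool:
    """True if two peaks share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def partition_peaks(nab2: List[Peak], stat6: List[Peak]):
    """Split peaks into NAB2-only, shared, and STAT6-only sets.

    Shared peaks are reported using the NAB2 coordinates (assumption).
    The pairwise comparison is quadratic and only suits small inputs.
    """
    shared = [p for p in nab2 if any(overlaps(p, q) for q in stat6)]
    nab2_only = [p for p in nab2 if not any(overlaps(p, q) for q in stat6)]
    stat6_only = [q for q in stat6 if not any(overlaps(q, p) for p in nab2)]
    return nab2_only, shared, stat6_only


# Toy coordinates for illustration only.
nab2_peaks = [("chr1", 100, 300), ("chr1", 1000, 1200), ("chr2", 50, 150)]
stat6_peaks = [("chr1", 250, 400), ("chr2", 500, 700)]

nab2_only, shared, stat6_only = partition_peaks(nab2_peaks, stat6_peaks)
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the pairwise loop, but the partition into three groups is the same.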

NAB2-STAT6 interacts with EGR1 and NAB1 and directs them to the nucleus.

(a) Eluates from NAB2 and STAT6 IPs from nuclear extracts of U2OS cells expressing NAB2-STAT6 for 1 day were subjected to MudPIT LC-MS/MS analysis for unbiased identification of the top interactors. Log2 iBAQ protein scores of STAT6 IP interactors are plotted against scores of the NAB2 IP. NAB1 and EGR1 were the top interactors.

(b) Control U2OS cells and U2OS cells expressing NAB2-STAT6 for 1 day were subcellularly fractionated into nuclear and cytoplasmic fractions. Immunoblot analysis shows that NAB2-STAT6 was only present in the nuclear fraction of Dox conditions. STAT6 was nuclear in both conditions. NAB2 and EGR1 were cytoplasmic in control conditions but became nuclear in Dox conditions. GAPDH was the cytoplasmic control and Histone H3 was the nuclear control.

(c) Immunocytochemistry (ICC) of NAB2 (red), STAT6 (green), and DAPI (blue) in SFT primary cells from Fig 4, control U2OS cells, and U2OS cells expressing NAB2-STAT6 (Dox) for one day. SFT and NAB2-STAT6 expressing cells show strong nuclear staining for NAB2 and STAT6. Control U2OS cells have nuclear STAT6 and cytoplasmic NAB2 staining.

(d) Immunocytochemistry (ICC) of NAB1 (red), FLAG (green), and DAPI (blue) in control U2OS cells and U2OS cells expressing NAB2-STAT6 (Dox) for one day. NAB2-STAT6 expressing cells show strong nuclear staining for FLAG and NAB1. Control U2OS cells have no FLAG staining and both cytoplasmic and nuclear NAB1 staining.

The SFT gene signature resembles EGR1 activated tumors.

(a) Ranking of TCGA tumors by their average single sample gene set enrichment analysis (ssGSEA) score using the SFT gene signature of upregulated genes from Fig 1 (n = 2,429). Neuroendocrine tumors highly express the signature while leukemias downregulate the signature.

(b) Kaplan-Meier curve showing survival analysis of tumors in the TCGA database stratified by high or low expression of the SFT gene signature of upregulated genes from Fig 1 (n = 2,429).

(c) ssGSEA of 87 mesotheliomas from TCGA using the SFT gene signature of upregulated genes from Fig 1 (n = 2,429). The TCGA-SH-A7BH sample shows significant upregulation of the SFT gene signature.

(d) NAB2-STAT6 gene fusion (NAB2 exons 1-6, STAT6 exons 17-22) present in TCGA-SH-A7BH, originally diagnosed as mesothelioma. Graphic generated with Arriba, a fusion detection algorithm.

(e) InterPro domain analysis of the top 400 most upregulated genes in SFTs shows Homeobox and Cadherin as the most significantly enriched protein domains.

(f) Screenshot displays the HOXA locus, which has significant RNAPII localization in SFTs, indicating that the Homeobox genes are actively transcribed in SFTs.

(g) Motif analysis of 11155 distal enhancer (>1 kb from nearest TSS) RNAPII peaks using HOMER shows HOX, GRE, and PGR as the most significantly enriched TF matrices.

NAB2-STAT6 drives the expression of EGR1 targets by driving co-activators to the nucleus.

Model for NAB2-STAT6 function: NAB2-STAT6 is directed to the nucleus by its STAT6 moiety. The fusion protein then directs the co-activators NAB1, NAB2, and EGR1 to the nucleus, which in turn direct NAB2-STAT6 to EGR1 target promoters and enhancers. These sites are highly activated by the recruited complex of co-activators.

© 2024, BioRender Inc. Any parts of this image created with BioRender are not made available under the same license as the Reviewed Preprint, and are © 2024, BioRender Inc.

Solitary Fibrous Tumors express a neuronal gene signature.

(a) Chromosomal rearrangements present in SFT 004 detected by Arriba. The inversion producing NAB2-STAT6 was the most abundant.

(b) Spearman’s correlation analysis of gene expression between 26 previously published SFT RNA-seq datasets (Robinson et al. 2013) and our 8 SFTs and matching normal tissues from FFPE RNA-seq.

(c) Principal Component Analysis (PCA) of FFPE RNA-seq datasets of SFTs and matching normal tissue. All SFTs cluster away from matching normal tissue along the PC1 axis (27% of variance), suggesting broad and consistent transcriptome changes in SFTs.

(d) Biological pathway GO analysis of the 3,769 genes downregulated in SFTs revealed enrichment for immune and cell signaling pathways.

(e) Dot plot of GSEA shows the upregulation of neuronal signatures and the downregulation of immune signatures in SFTs.

(f) TRANSFAC motif (transcription factor motifs at +/-1 kb from the transcription start site) GO analysis of the 3,769 genes downregulated in SFTs revealed enrichment for HTF4 and MRF4, transcription factors active in mesoderm-derived tissues.

Generation of an inducible NAB2-STAT6 system to investigate early transcriptional changes.

(a) Principal Component Analysis (PCA) of 3’ mRNA Quant-seq (n = 4) datasets of Dox-treated (1, 2, or 3 days) and control cells. The longer NAB2-STAT6 is expressed, the further Dox-treated cells move along the PC1 axis (27% of variance), suggesting that NAB2-STAT6 induces broad and consistent transcriptome changes in U2OS cells.

(b) Biological pathway GO analysis of 299 upregulated genes in cluster 1 of the heatmap from Fig 2b revealed enrichment for translation and biosynthetic pathways.

(c) TRANSFAC motif (transcription factor motifs at +/-1 kb from the transcription start site) GO analysis of 299 upregulated genes in cluster 1 of the heatmap from Fig 2b revealed enrichment for DP-1 and ZF5 motifs.

(d) Heatmap clustering analysis of 166 genes that are differentially expressed (fold change >1, FDR <0.1) across 1, 2, and 3 days of NAB2-STAT6 expression (Dox) and SFTs as determined by 3’ mRNA Quant-seq (n = 4). Includes EGR1 targets like IGF2, LHX2, and ROBO2.

EGR1 targeted promoters and enhancers are activated by NAB2-STAT6.

(a) Pie chart showing the percentage of enhancer (>1 kb from nearest TSS) and promoter (<1 kb from nearest TSS) sites among NAB2-STAT6 FLAG peaks (n = 1394).

(b) Average profiles and heatmaps of NAB2 and STAT6 in control U2OS cells at 1394 NAB2-STAT6 FLAG peaks. There is significant NAB2 but no STAT6 localization.

(c) Average profiles and heatmaps of STAT6 in control U2OS cells and of NAB2-STAT6 FLAG in control U2OS cells and U2OS cells expressing NAB2-STAT6 (Dox) for 2 days, at 4488 STAT6 control peaks. There is significant STAT6 but limited NAB2-STAT6 FLAG localization.

(d) Screenshot displays two enhancers (highlighted in yellow) of UNCX that gain NAB2-STAT6 FLAG localization and have increases in EGR1 localization and accessibility by ATAC-seq.

(e) Screenshot displays the promoter (highlighted in yellow) of EGR1 that gains NAB2-STAT6 FLAG localization and has increased EGR1 localization and accessibility by ATAC-seq.
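The enhancer versus promoter split used for the pie chart in panel (a) rests on the distance from each peak to the nearest TSS, with a 1 kb cutoff. A minimal sketch of that rule follows. It assumes that distance is measured from the peak midpoint and that a distance of exactly 1 kb counts as promoter; the legend specifies neither. The function names and toy coordinates are hypothetical.

```python
from bisect import bisect_left
from typing import Dict, List


def nearest_tss_distance(position: int, sorted_tss: List[int]) -> int:
    """Distance from a position to the closest TSS on one chromosome.

    `sorted_tss` must be sorted in ascending order and non-empty.
    """
    i = bisect_left(sorted_tss, position)
    # Only the TSSs immediately left and right of the position can be nearest.
    candidates = sorted_tss[max(i - 1, 0): i + 1]
    return min(abs(position - tss) for tss in candidates)


def classify_peak(chrom: str, start: int, end: int,
                  tss_by_chrom: Dict[str, List[int]],
                  cutoff: int = 1000) -> str:
    """Return 'promoter' or 'enhancer' for a single peak.

    Assumptions: distance is taken from the peak midpoint, and a peak
    exactly `cutoff` bases from a TSS is called a promoter.
    """
    midpoint = (start + end) // 2
    distance = nearest_tss_distance(midpoint, tss_by_chrom[chrom])
    return "promoter" if distance <= cutoff else "enhancer"


# Toy TSS positions and peaks for illustration only.
tss_by_chrom = {"chr1": [1000, 5000]}
peaks = [("chr1", 1100, 1300), ("chr1", 2900, 3100), ("chr1", 7900, 8100)]

calls = [classify_peak(c, s, e, tss_by_chrom) for c, s, e in peaks]
promoter_fraction = calls.count("promoter") / len(calls)
```

The promoter and enhancer fractions computed this way correspond to the two slices of a pie chart like the one described.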

NAB2-STAT6 localizes to EGR1 targets in primary tumors.

(a) The most abundant gene fusion present in the primary SFT used in Fig 4 by RNA-seq was NAB2-STAT6 with exons 1-6 of NAB2 and exons 17-22 of STAT6. Graphic generated with Arriba, a fusion detection algorithm.

(b) Venn diagram of overlapping NAB2 and STAT6 peaks from ChIP-seq in control U2OS cells, showing minimal overlap in physiological conditions and supporting the validity of the dual-antibody strategy from Fig 4a.

(c) Pie chart showing the percentage of enhancer (>1 kb from nearest TSS) and promoter (<1 kb from nearest TSS) sites among SFT NAB2-STAT6 peaks (n = 718).

(d) Motif analysis of 5921 NAB2 only peaks using HOMER shows EGR1, EGR2, and WT1 as the most significantly enriched TF matrices.

(e) Motif analysis of 718 NAB2-STAT6 peaks using HOMER shows EGR1 and WT1 as the most significantly enriched TF matrices.

(f) Motif analysis of 1285 STAT6 only peaks using HOMER shows GRE and STAT3 as the most significantly enriched TF matrices.

(g) Venn diagram of overlapping U2OS NAB2-STAT6 FLAG and SFT NAB2-STAT6 peaks.

(h) Screenshot displays an enhancer (highlighted in yellow) of ARHGAP32 that has significant NAB2, STAT6, and RNAPII localization in SFTs and in U2OS gains NAB2-STAT6 FLAG localization and has increased EGR1 localization and accessibility by ATAC-seq.

NAB2-STAT6 interacts with EGR1 and NAB1 and directs them to the nucleus.

(a) Eluates from EGR1 IPs from nuclear extracts of control U2OS cells and of U2OS cells expressing NAB2-STAT6 for 1 day were subjected to MudPIT LC-MS/MS analysis for unbiased identification of the top interactors. Log2 iBAQ protein scores of control IP interactors are plotted against scores of the NAB2-STAT6 expressing IP. NAB2 and STAT6 were only pulled down in NAB2-STAT6 expressing cells.

(b) We generated a clone that constitutively expressed NAB2-STAT6 (NAB2 exons 1-4, STAT6 exons 2-22) with a C-terminal FLAG tag in HEK293T. Immunoblot analysis of whole cell extracts shows strong expression of NAB2-STAT6. GAPDH was used as control.

(c) Eluates from FLAG IPs from nuclear extracts of U2OS cells expressing NAB2-STAT6 for 1 day and of HEK293T clone #1 constitutively expressing NAB2-STAT6 were subjected to MudPIT LC-MS/MS analysis for unbiased identification of the top interactors. Log2 iBAQ protein scores of U2OS IP interactors are plotted against scores of the HEK293T IP. NAB1 was the top interactor.

(d) Peptide counts pulled down from IP-MS.

(e) Map showing the position on NAB2 of peptides pulled down by the U2OS NAB2-STAT6 FLAG IP. Some of the peptides pulled down are present only in WT NAB2, not in NAB2-STAT6.

(f) Map showing the position on NAB2 of peptides pulled down by the HEK293T NAB2-STAT6 FLAG IP. Some of the peptides pulled down are present only in WT NAB2, not in NAB2-STAT6.

(g) Immunocytochemistry (ICC) of NAB2 (red), STAT6 (green), and DAPI (blue) in WT U2OS cells. NAB2 was primarily cytoplasmic and STAT6 primarily nuclear.

(h) Immunocytochemistry (ICC) of NAB1 (red), FLAG (green), and DAPI (blue) in WT U2OS cells. NAB1 was cytoplasmic and nuclear while FLAG staining was absent.

The SFT gene signature resembles EGR1 activated tumors.

Log2 fold change in Spearman correlation with human tissues from matching normal tissue to SFT (from Fig 1), calculated using RNA-seq datasets of primary human samples from the Human Protein Atlas project (proteinatlas.org). SFTs become more correlated with neuronal tissues and less correlated with lung tissue.
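The quantity plotted here, the log2 fold change in Spearman correlation between matching normal tissue and SFT against a reference tissue, can be written out as a short sketch. It assumes each sample is a vector of expression values over the same ordered gene list and that both correlations are positive, since the log2 ratio is otherwise undefined. The helper names and toy vectors are hypothetical, and the legend does not describe the exact computation used.

```python
import math
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def log2_correlation_change(normal: List[float], tumor: List[float],
                            reference: List[float]) -> float:
    """log2(rho(tumor, reference) / rho(normal, reference)).

    Assumption: both correlations are positive.
    """
    return math.log2(spearman(tumor, reference) / spearman(normal, reference))


# Toy expression vectors over four genes, for illustration only.
reference_tissue = [1.0, 2.0, 3.0, 4.0]
normal_sample = [1.0, 2.0, 4.0, 3.0]
tumor_sample = [2.0, 3.0, 5.0, 9.0]

change = log2_correlation_change(normal_sample, tumor_sample, reference_tissue)
```

A positive value means the tumor is more correlated with the reference tissue than its matching normal tissue is; a negative value means it is less correlated.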