Figures and data

EPB41L4A-AS1 is a highly expressed lncRNA with a widespread cellular distribution.
(A) Graphical overview of the screen to identify candidate cis-acting lncRNAs. (B) RT-qPCR to assess the expression of each candidate cis-acting lncRNA and its predicted cis-regulated target genes, following KD with two unique GapmeRs. (C) (Top) Overview of the 5q22.1-2 locus, with the bars highlighting the two TADs. (Bottom) Zoom in on the EPB41L4A-AS1 locus, with tracks for the PhyloP conservation score (UCSC genome browser), H3K4me3 and H3K27ac histone modifications (ENCODE), PolyA+ and Ribo(-) RNA-seq coverage (this study), GENCODE transcripts and CAGE reads (FANTOM5). (D) A representative single-molecule RNA FISH (smFISH) image showing the intracellular localization of EPB41L4A-AS1. GAPDH and DAPI were used to label the cytoplasm and the nucleus, respectively (scale bar = 20 µm, 100X magnification). (E) RT-qPCR analysis of subcellular fractionation experiments. The percentages of RNA in each compartment were obtained by normalizing the expression in the different fractions to that in whole cells. (F) UMI-4C contact profiles using baits targeting the TSSs of STARD4 (left), EPB41L4A-AS1 (center), or EPB41L4A (right). The dotted line represents the center of the EPB41L4A-AS1 locus. All experiments were performed in n=3 biological replicates, except UMI-4C with n=2, with the error bars in the barplots representing the standard deviation. ns = P>0.05; * = P<0.05; ** = P<0.01; *** = P<0.001 (two-sided Student’s t-test).
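The normalization described in (E) can be sketched as follows. This is a minimal illustration, assuming that RT-qPCR Ct values are converted to relative amounts with the standard 2^-ΔCt transform (100% primer efficiency) and that equivalent proportions of each fraction and of the whole-cell sample were assayed. The function name and the Ct values are hypothetical and are not taken from the study.

```python
def fraction_percentages(ct_whole, ct_fractions):
    """Express each fraction as a percentage of the whole-cell signal.

    Assumes 100% amplification efficiency, so that one Ct cycle
    corresponds to a two-fold difference in template amount.
    """
    return {
        name: 100.0 * 2.0 ** (ct_whole - ct)
        for name, ct in ct_fractions.items()
    }


# Hypothetical Ct values: both fractions amplify one cycle later than whole cells.
example = fraction_percentages(24.0, {"cytoplasm": 25.0, "nucleus": 25.0})
print(example)
```

With real data the percentages need not sum to exactly 100, since material can be lost or carried over between fractions.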

EPB41L4A-AS1 is a cis-acting lncRNA affecting genome-wide gene expression.
(A) (Top) To-scale view of the EPB41L4A-AS1-EPB41L4A locus, with zoomed areas corresponding to the two TSSs. (Middle) CTCF and H3K27ac coverage across this region. (Bottom) Micro-C data in H1-hESCs (ref. 44) show continuous contacts throughout the EPB41L4A gene body. (B) RT-qPCR and (C) Western blot for the indicated genes and proteins following EPB41L4A-AS1 KD. (D) H3K27ac CUT&RUN-qPCR following EPB41L4A-AS1 KD with GapmeRs using primers targeting the promoter of EPB41L4A. (E) UMI-4C contact profiles in control and LNA1-transfected cells using baits targeting the TSS of EPB41L4A-AS1. The green area represents the quantified genomic interval, and the p-value was calculated using a Chi-squared test. (F) Changes in gene expression for the genes in the two flanking TADs of the lncRNA in cells transfected with GapmeRs targeting EPB41L4A-AS1. The vertical dotted lines represent the TAD boundaries (as assessed by TADmap), the continuous vertical line the lncRNA locus and inter-TAD boundary, and the horizontal continuous line a log2Fold-change equal to 0. The dots represent individual genes, with the significant ones highlighted. (G) MA plot showing the changes in genome-wide gene expression in cells transfected with GapmeRs targeting EPB41L4A-AS1. (H) Same as (G), but with GapmeRs targeting EPB41L4A. (I) GO enrichment analysis for the upregulated (left) and downregulated (right) genes after EPB41L4A-AS1 KD. All experiments were performed in n=3 biological replicates, except UMI-4C with n=2, with the error bars in the barplots representing the standard deviation. ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test). A gene was considered to be differentially expressed if both adjusted P<0.05 and |log2Fold-change| >0.41 (corresponding to a change of 33%).
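The differential expression criterion stated above can be checked with a few lines of arithmetic. The sketch below only restates the two cutoffs given in the legend; the function and constant names are illustrative assumptions, not part of the authors' pipeline.

```python
LOG2FC_CUTOFF = 0.41  # |log2 fold-change| cutoff from the legend
PADJ_CUTOFF = 0.05    # adjusted P-value cutoff from the legend


def percent_change(log2fc):
    """Convert a log2 fold-change into a percent change relative to control."""
    return (2.0 ** log2fc - 1.0) * 100.0


def is_differentially_expressed(padj, log2fc):
    """Apply both cutoffs: adjusted P<0.05 and |log2FC| > 0.41."""
    return padj < PADJ_CUTOFF and abs(log2fc) > LOG2FC_CUTOFF


print(round(percent_change(0.41)))   # increase of about 33%
print(round(percent_change(-0.41)))  # decrease of about 25%
```

Because the cutoff is symmetric on the log2 scale, the 33% figure applies to upregulation; the same cutoff in the other direction corresponds to a decrease of roughly 25%.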

SUB1 interacts with EPB41L4A-AS1 and affects gene expression at the DNA and RNA levels.
(A) Schematics of the EPB41L4A-AS1 locus with tracks depicting the eCLIP peaks for both SUB1 and NPM1 (source: ENCODE). (B) Average expression levels (in FPKM) of SUB1 and NPM1 in cells transfected with GapmeRs targeting either EPB41L4A-AS1 or EPB41L4A, with the error bars representing the standard deviation across the n = 3 replicates. DESeq2 adjusted P-values compared to control GapmeR are also reported. (C) Changes in gene expression upon EPB41L4A-AS1 KD of the genes ranked by the SUB1 eCLIP binding confidence. (D) Western blot following RIP using either a SUB1 or IgG antibody. TUBULIN was used as a negative control, IN - input, SN - supernatant/unbound, B - bound. (E) Same as (C), but with genes ranked by their enrichment in the RIP data (log2FC RIP / Input). (F) Metagene profile around TSS of the normalized SUB1 CUT&RUN signal, stratified by gene expression levels in MCF-7 cells. (G) Changes in gene expression upon EPB41L4A-AS1 KD with GapmeRs for genes with and without a high-confidence SUB1 peak in their TSS. All experiments were performed in n = 3 biological replicates. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, the significance of the different comparisons was computed by a Mann-Whitney test, and a global ANOVA p-value is also reported. In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001.
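The boxplot convention used throughout these legends (median, quartiles, whiskers within 1.5 IQRs, outliers beyond) can be sketched as below. This is an illustration under two assumptions: quartiles are computed with linear interpolation (the "inclusive" method of Python's `statistics.quantiles`), and whiskers end at the most extreme observation inside the 1.5 IQR fences. The plotting software used in the study may compute quartiles slightly differently; the function name and values are hypothetical.

```python
import statistics


def boxplot_summary(values):
    """Return the elements drawn in a boxplot with 1.5 * IQR whiskers."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme observations inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    # Observations outside the fences are drawn as single points.
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "whiskers": (inside[0], inside[-1]),
        "outliers": outliers,
    }


summary = boxplot_summary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(summary)
```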

EPB41L4A-AS1 and SUB1 depletion results in a widespread accumulation of mature snoRNAs.
(A) Changes in gene expression upon EPB41L4A-AS1 KD with GapmeRs of the indicated RNA classes. (B) Schematics depicting the different regions which were separately quantified in each SNHG. (C) Changes in RNA-seq read coverage in different regions upon EPB41L4A-AS1 KD with GapmeRs. (D) As in (A), but for SUB1 KD with siRNAs. (E) As in (C), but for SUB1 KD. (F) Correspondence between changes in snoRNA expression after EPB41L4A-AS1 and SUB1 KD. The color indicates the different snoRNA classes, and Spearman’s correlation coefficient is shown. All experiments were performed in n = 3 biological replicates. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, the significance of the different comparisons was computed by a Mann-Whitney test, and a global ANOVA P-value is also reported.
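Spearman's correlation coefficient, reported in (F), is the Pearson correlation computed on ranks. The following is a minimal standard-library sketch with average ranks for ties; the function names and the log2 fold-change values are made up for illustration and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical log2 fold-changes of four snoRNAs in two knockdowns.
lncrna_kd = [0.1, 0.5, 0.9, 1.4]
sub1_kd = [0.2, 0.4, 1.1, 3.0]
rho = spearman_rho(lncrna_kd, sub1_kd)
print(rho)
```

Because only ranks enter the calculation, the coefficient is insensitive to the magnitude of individual fold-changes.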

EPB41L4A-AS1 and SUB1 depletion results in a widespread accumulation of mature snoRNAs.
(A) Representative immunofluorescence images for SUB1 after EPB41L4A-AS1 depletion with two distinct GapmeRs (scale bar = 20 µm, 100X magnification). (B) Same as in (A), but for NPM1. (C) Quantification of the kurtosis of SUB1 nuclear signal in the indicated conditions. (D) Same as in (C), but for NPM1. (E) Representative immunofluorescence images for SUB1 after SUB1 KD with siRNAs (scale bar = 20 µm, 60X magnification). (F) Same as in (E), but for NPM1. (G) Same as in (D), but after SUB1 depletion. All experiments were performed in n = 3 biological replicates. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, the significance of the different comparisons was computed by a Mann-Whitney test, and a global ANOVA P-value is also reported. In each boxplot, points represent individual measurements (cell nuclei).
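Kurtosis, used in (C), (D) and (G) to summarize the nuclear signal, is the fourth standardized moment of the pixel intensities. The sketch below assumes the non-excess (Pearson) definition and a flat list of intensities from one segmented nucleus; whether the authors used the excess form, and how nuclei were segmented, is not stated here. All names and values are hypothetical.

```python
def kurtosis(intensities):
    """Fourth standardized moment (Pearson definition, normal = 3)."""
    n = len(intensities)
    mean = sum(intensities) / n
    m2 = sum((v - mean) ** 2 for v in intensities) / n
    m4 = sum((v - mean) ** 4 for v in intensities) / n
    return m4 / m2 ** 2


# Hypothetical pixel intensities within one nucleus.
diffuse = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10]  # evenly spread signal
punctate = [1, 1, 1, 1, 1, 1, 1, 1, 1, 50]       # signal concentrated in a focus

print(kurtosis(diffuse), kurtosis(punctate))
```

A signal concentrated in a few bright pixels gives a heavy-tailed intensity distribution and therefore a higher kurtosis than an evenly spread signal.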

EPB41L4A-AS1, but not SNORA13 expression, is required for proper snoRNA expression.
(A) Average expression levels (in FPKM) of SNORA13 in cells transfected with GapmeRs targeting EPB41L4A-AS1 (rRNA-depleted RNA-seq), with the error bars representing the standard deviation across the three replicates. DESeq2 adjusted P-values compared to control GapmeR are also reported. (B) Northern blot for SNORA13 in cells transfected with GapmeRs targeting EPB41L4A-AS1. U6 was used as a loading control, and total RNA stain is also shown on top. (C) Schematics of the expected fragment sizes following HinfI digestion in the aRT-PCR assay (left), and agarose gel following HinfI digestion of the PCR-amplified 18S rRNA in cells transfected with GapmeRs targeting either EPB41L4A-AS1 or EPB41L4A (right). No enzyme and genomic DNA (gDNA) were used as negative controls. (D) Allele frequency at 18S:1248 in the polyA+ RNA-seq dataset, using rRNA reads. (E) UCSC Genome Browser view of the EPB41L4A-AS1 locus, with tracks showing the rRNA-depleted RNA-seq coverage of EPB41L4A-AS1 OE and control cells. (F) RT-qPCR for the indicated genes upon EPB41L4A-AS1 KD with GapmeRs (right) or overexpression with a full-length unspliced vector (left) over the course of three days post transfection. (G) RT-qPCR for the indicated snoRNAs upon EPB41L4A-AS1 KD with GapmeRs over the course of three days post transfection. (H) Schematics of the different EPB41L4A-AS1 and SNORA13 overexpressing vectors used in the rescue experiments. (I) The ratio between changes in gene expression detected by RT-qPCR in cells transfected with an EPB41L4A-AS1 or SNORA13 overexpressing vector vs control cells. All experiments were performed in n = 3 biological replicates, with the error bars in the barplots and forest plots representing the standard deviation. In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).

The increased abundance of snoRNAs is primarily due to their hosts’ increased transcription and stability.
(A) Workflow of the SLAM-seq experiment. MCF-7 cells were transfected with the indicated GapmeRs for 48 hours, after which the media was replaced with media containing 4sU and the cells were harvested at different time points. (B) Fitted model depicting the synthesis and decay rates of the EPB41L4A mRNA upon EPB41L4A-AS1 KD with GapmeRs. (C) Changes in synthesis rate upon EPB41L4A-AS1 KD with GapmeRs for the indicated group of genes. (D) Same as in (C), but for changes in half-lives. (E) Half-lives in control conditions of the indicated SNHG regions as described in Fig. 4B. (F) Same as in (E), but after EPB41L4A-AS1 KD with LNA2. (G) Changes in synthesis rate upon EPB41L4A-AS1 KD with GapmeRs for the indicated SNHG regions. (H) Same as in (G), but for changes in half-lives. All experiments were performed in n = 3 biological replicates. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, the significance of the different comparisons was computed by a Mann-Whitney test, and a global ANOVA P-value is also reported. In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001.
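The relationship between decay rate and half-life in a metabolic labeling experiment such as (A) can be sketched as follows. This assumes simple first-order decay at steady state, where the fraction of newly synthesized (4sU-labeled) RNA after a pulse of duration t is 1 - exp(-k·t); it is not necessarily the model the authors fitted in (B). Function names and numbers are hypothetical.

```python
import math


def decay_rate_from_labeled_fraction(new_fraction, hours):
    """Estimate the decay rate k (per hour) from the labeled fraction.

    Assumes steady state and first-order decay:
    new / total = 1 - exp(-k * t).
    """
    return -math.log(1.0 - new_fraction) / hours


def half_life(decay_rate):
    """Half-life in hours for a first-order decay rate k."""
    return math.log(2.0) / decay_rate


# Hypothetical transcript: half of its reads are labeled after a 4-hour pulse.
k = decay_rate_from_labeled_fraction(0.5, 4.0)
print(k, half_life(k))
```

Under these assumptions, the synthesis rate at steady state equals k multiplied by the transcript abundance, so changes in abundance can be attributed to synthesis, decay, or both.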

Cells with reduced EPB41L4A-AS1 expression display reduced proliferation and increased invasion capacity.
(A) Representative brightfield images of the wound at the indicated time points and conditions. (B) Wound area (top-left), normalized wound area (top-right), closure percentage (bottom-left) and migration rate (bottom-right) at the indicated time points and conditions. The significance was calculated by a two-sided Student’s t-test. (C) Growth curve of the cells in the indicated conditions over the course of three days. The significance was calculated by a two-sided Student’s t-test. (D) Changes in gene expression of the EMT signature genes (from MSigDB) after KD with GapmeRs targeting either EPB41L4A-AS1 or EPB41L4A (polyA+ RNA-seq data). The significance of the comparison was computed by a Mann-Whitney test. All experiments were performed in n = 3 biological replicates. The error bars in the barplots represent the standard deviation. In the boxplot, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points. The points in the boxplot represent individual genes, and their color indicates whether they were found to be significantly (adjusted P<0.05 and |log2Fold-change| >0.41) dysregulated (red) or not (black). In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).

UCSC Genome Browser view of the loci containing the selected lncRNAs and target genes for validation.
View of the EPB41L4A-AS1 - EPB41L4A (A), LINC00938 - ARID2 (B), MIR9-3HG - POLG (C) and NET1e - NET1 - CALML5 (D) loci. In each view, the PhyloP conservation score and GeneHancer (GH) elements are also reported.

EPB41L4A-AS1 is dysregulated in several cancer types and correlates with survival.
(A) Expression level of EPB41L4A-AS1 (left) and EPB41L4A (right) across the TCGA cohort. The thick lines and each point represent the median and an individual sample, respectively, and the color on top reflects whether the indicated genes are significantly more expressed in control or tumor conditions. (B) Correspondence between the expression of EPB41L4A-AS1 and EPB41L4A in GTEx v8 (left), TCGA control (center) and tumor (right) samples. The Spearman’s correlation coefficient and p-value are also reported. (C) Kaplan-Meier survival curve stratified into individuals with the lower and higher 50% expression level of EPB41L4A-AS1. (D) Expression level of EPB41L4A-AS1 (left) and EPB41L4A (right) in patients with the different BRCA subtypes and matched controls. The thick lines, edges of the box, whiskers and each point represent the median, first and third quartiles, the upper and lower 1.5 interquartile ranges (IQRs), and an individual sample, respectively. (E) Correspondence between the expression of EPB41L4A-AS1 and EPB41L4A in BRCA samples. (F) Expression level of EPB41L4A-AS1 in patients at different BRCA stages. The white dot and edges of the box represent the median and the first and third quartiles, respectively. The one-way ANOVA p-value for differential expression is also reported. In all cases, * - P<0.05. Data were accessed using the GEPIA2 portal (ref. 98).

EPB41L4A-AS1 expression is altered upon multiple stimuli.
(A) CPAT (ref. 37) analysis of the indicated transcripts. XIST and ACTB are reported as controls for a non-coding and a coding transcript, respectively. Changes in gene expression for the indicated genes (B) during a time course of 7 days in serum starvation conditions, (C) following the release from a double thymidine cell cycle block, (D) in a panel of 250 unique cell type-treatment combinations (the Spearman’s correlation coefficient and trendline are also reported), and (E) after exposing MCF-7 cells to LPS, H2O2, thapsigargin and etoposide. When applicable, all experiments were performed in n = 3 biological replicates, with the error bars in the dotplots and boxplots representing the standard deviation. In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).

EPB41L4A-AS1 unidirectionally facilitates EPB41L4A expression in cis.
RT-qPCR to assess the expression of the reported genes after (A) transfection with siRNAs against EPB41L4A-AS1, (B) CRISPRa with guides targeting the EPB41L4A-AS1 promoter, (C) transfection with plasmid encoding the EPB41L4A-AS1 cDNA, transfection with GapmeRs (D) and siRNAs (E) targeting EPB41L4A, (F) CRISPRa with guides targeting the EPB41L4A promoter and (G) transfection with plasmid encoding the EPB41L4A cDNA. (H) Changes in EPB41L4A-AS1 expression after rescuing EPB41L4A-AS1 with an ectopic plasmid or CRISPRa following its KD with GapmeRs. Asterisks indicate significance relative to the –/– control. (I) Same as in (H), but for changes in EPB41L4A expression. (J) UMI-4C contact profiles in control and LNA2-transfected cells using baits targeting the TSS of EPB41L4A-AS1. The green area represents the quantified genomic interval, and the p-value was calculated using a Chi-squared test. All experiments were performed in n = 3 biological replicates, except UMI-4C with n=2, with the error bars in the boxplots representing the standard deviation. In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).
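A Chi-squared comparison of contacts within a quantified interval, as mentioned in (J), can be illustrated with a 2x2 table of UMI counts (inside versus outside the interval, control versus KD). The sketch below uses the Pearson statistic without continuity correction and the closed-form P-value for one degree of freedom; the exact normalization and test settings used in the UMI-4C analysis may differ, and all counts and names are hypothetical.

```python
import math


def chi2_test_2x2(a, b, c, d):
    """Pearson Chi-squared test on the table [[a, b], [c, d]].

    Rows: conditions (e.g. control, KD).
    Columns: UMIs inside vs outside the quantified interval.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the Chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 of 100 UMIs in the interval in control, 10 of 100 in KD.
stat, p = chi2_test_2x2(30, 70, 10, 90)
print(stat, p)
```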

Most of the transcriptomic changes following EPB41L4A-AS1 downregulation are not explained by EPB41L4A.
(A) Hierarchical clustering (top) and Principal Component Analysis (PCA, bottom) of the polyA+ RNA-seq libraries. The color in the heatmap represents the Euclidean distance between the libraries. (B) Heatmap of the differentially expressed genes shared between EPB41L4A-AS1 and EPB41L4A depletions. Color intensity reflects the changes in gene expression (log2FC). (C) GO enrichment analysis of the genes downregulated following EPB41L4A depletion. (D) Barplot of the lncRNAs (annotated in GENCODE) detected in MCF-7 cells, ordered by their expression. EPB41L4A-AS1 and other representative cis- or trans-acting lncRNAs are highlighted. All experiments were performed in n = 3 biological replicates, and a gene was considered to be differentially expressed if both adjusted P<0.05 and |log2Fold-change| >0.41 (corresponding to a change of 33%).

SUB1 is both a chromatin-associated and an RNA-binding protein.
(A) Gene set enrichment analysis (GSEA) using cellular component as ontology. (B) Running score and preranked list of the nucleolus-associated genes in the GSEA analysis. (C) RT-qPCR for the indicated genes after RIP with either an anti-SUB1 or control IgG antibody. (D) MA plot showing the relative enrichment in RIP-seq after using either an anti-SUB1 or control IgG antibody. (E) Forest plot for the enrichment of the genes in the ENCODE SUB1 eCLIP in our RIP-seq data. The genes in the eCLIP dataset were ranked according to their enrichment score. (F) GO enrichment analysis of the genes with a SUB1 peak as determined from our CUT&RUN data. All experiments were performed in n = 3 biological replicates, with the error bars representing the standard deviation (C) or the 95% confidence interval (E). A gene was considered to be differentially expressed if both adjusted P<0.05 and |log2Fold-change| >0.41 (corresponding to a change of 33%). In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test in (C), Fisher’s exact test in (E)).

EPB41L4A-AS1 and SUB1 depletion affect the expression of different classes of snoRNAs.
(A) Barplot depicting the normalized read density in the Ribo(-) RNA-seq data in EPB41L4A exons and introns. (B) Boxplot of gene expression changes (log2FC) after EPB41L4A-AS1 depletion for the indicated RNA categories. (C) Correspondence of the changes in gene expression after EPB41L4A-AS1 KD between the pre-intronic, post-intronic and snoRNA regions. Spearman’s correlation coefficients are also reported. (D) Normalized Ribo(-) RNA-seq signal over the intronic regions before (top) and after (bottom) the expressed snoRNAs. RT-qPCR (E) and Western blot (F) upon SUB1 depletion with siRNAs to assess KD efficiency. (G) MA plot showing the genome-wide gene expression changes after SUB1 KD with siRNAs. (H) Same as in (B), but after SUB1 KD with siRNAs. All experiments were performed in n = 3 biological replicates, with the error bars in the barplots representing the standard deviation. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, the significance of the different comparisons was computed by a Mann-Whitney test, and a global ANOVA p-value is also reported. A gene was considered to be differentially expressed if both adjusted P<0.05 and |log2Fold-change| >0.41 (corresponding to a change of 33%). In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).

GAS5 depletion does not affect snoRNA expression.
(A) Schematics of the GAS5 locus with tracks depicting the eCLIP peaks for both SUB1 and NPM1 (source: ENCODE), as well as the location of the two GapmeRs that were used. (B) RT-qPCR upon GAS5 depletion with GapmeRs to assess KD efficiency. (C) MA plot showing the genome-wide gene expression changes after GAS5 KD with GapmeRs. (D) Violin/boxplots of gene expression changes (log2FC) after GAS5 depletion for the indicated RNA classes. (E) Boxplot of gene expression changes (log2FC) after GAS5 depletion for the indicated RNA categories. (F) Correspondence between the changes in gene expression of the color-coded snoRNA categories after EPB41L4A-AS1 and GAS5 depletion. The trendline and Spearman correlation coefficient are also reported. All experiments were performed in n = 3 biological replicates, with the error bars in the barplots representing the standard deviation. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, the significance of the different comparisons was computed by a Mann-Whitney test, and a global ANOVA p-value is also reported. A gene was considered to be differentially expressed if both adjusted P<0.05 and |log2Fold-change| >0.41 (corresponding to a change of 33%). In all cases, ns - P>0.05; * - P<0.05; ** - P<0.01; *** - P<0.001 (two-sided Student’s t-test).

Nucleolar stress induced by CX-5461 treatment affects SUB1 and NPM1 nuclear patterns.
(A) Representative immunofluorescence images for NPM1 after treating cells with the indicated CX-5461 concentrations and time (scale bar = 20 µm). (B) Boxplot depicting the changes in the kurtosis of the nuclear NPM1 signal after treating MCF-7 cells with the indicated CX-5461 concentrations and time. (C) Same as in (A), but for SUB1. (D) Same as in (B), but for SUB1. All experiments were performed in n = 3 biological replicates. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, and the significance of the different comparisons was computed by a Mann-Whitney test.

EPB41L4A-AS1 affects local RNA metabolism.
(A) Changes in synthesis rate in KD cells of the genes in the two flanking TADs of EPB41L4A-AS1. The vertical dotted lines represent the TAD boundaries (as assessed by TADmap), the continuous vertical line the lncRNA locus and inter-TAD boundary, and the horizontal continuous line a log2Fold-change equal to 0. The dots represent individual genes, with the significant (adjusted P<0.05 and |log2Fold-change| >0.41) ones highlighted in red. (B) Same as in (A), but for RNA half-lives. (C) Same as in (A) and (B), but for CTCF binding as assessed by CUT&RUN. Each dot and name represent a single CTCF peak and the closest gene, respectively. (D) Changes in gene expression of the p53 signature genes (from MSigDB) after KD with GapmeRs targeting either EPB41L4A-AS1 or EPB41L4A (polyA+ RNA-seq data), and in SNORA13 KO cells (ref. 27). The points in the boxplot represent individual genes, and their color indicates whether they were found to be significantly dysregulated (red) or not (black). All experiments were performed in n = 3 biological replicates. In the boxplots, the thick line, edges of the box, and whiskers represent the median, first and third quartiles, and the upper and lower 1.5 interquartile ranges (IQRs), respectively. Outliers (observations outside the 1.5 IQRs) are drawn as single points, and the significance of the different comparisons was computed by a Mann-Whitney test.