Figures and data

CRISPR-based screen identifies U2AF2 as a promoter of intron retention.
(A) Schematic of the PURPL IR reporter used in the screen to identify IR regulators. The pLCHKO vector contains the puromycin resistance gene used for selection. The transgene also contains the genomic sequences of exon 2 (161 bp), intron 2 (1,943 bp), and exon 3 (163 bp) of the PURPL gene under a doxycycline-inducible promoter. A hybrid guide (hg)RNA for guiding SpCas9 and LbCas12a is expressed under a U6 promoter. Top arrow: The hgRNA is cleaved by Cas12a to produce 2 guide RNAs for dual targeting of each protein-coding gene, paralog targeting, and genetic interaction mapping. Bottom arrow: Intron splicing and intron retention generate two different reads. (B) Two cell lines stably expressing Cas9 and Cas12a were used: HAP1 and RPE1. The CRASP-seq pipeline is depicted up to Illumina paired-end sequencing of the CRISPR libraries. (C) Heatmap showing single genes only (after removing gene pairs), whether identified individually or as part of a gene pair. The hits are ordered by the average ΔPIR values across HAP1 and RPE1, from lowest to highest. (D) Left: Gene ontology enrichment analysis of biological processes regulated by the hits of the screen. The most significant pathways involve mRNA processing and RNA splicing. Right: Gene ontology enrichment analysis of cellular components regulated by the hits of the screen. These components mainly include the spliceosomal complex, the small nuclear ribonucleoprotein complex, and the catalytic step 2 spliceosome. (E) Venn diagram showing the 5 common hits that reduce the percentage of intron retention (ΔPIR) of the splicing reporter in both cell lines. (F) Table listing the top 5 hits of the screen, with their full names and the ΔPIR of each hit in each cell line.
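As a reading aid for the ΔPIR values reported in this figure, the following minimal sketch shows one common way to compute a percent intron retention (PIR) and its change relative to a control. It assumes PIR is the share of intron-retained reads among all reporter reads (retained plus spliced); the function names and read counts are hypothetical and do not reproduce the CRASP-seq pipeline itself.

```python
def pir(retained_reads: int, spliced_reads: int) -> float:
    """Percent intron retention, assuming PIR = retained / (retained + spliced) * 100."""
    total = retained_reads + spliced_reads
    if total == 0:
        raise ValueError("PIR is undefined when no reporter reads are observed")
    return 100.0 * retained_reads / total


def delta_pir(target: tuple, control: tuple) -> float:
    """ΔPIR = PIR(targeting guide) - PIR(control guide).

    Each argument is a (retained_reads, spliced_reads) pair.
    A negative value means the perturbation reduced intron retention.
    """
    return pir(*target) - pir(*control)


# Hypothetical read counts, for illustration only.
control_counts = (600, 400)  # PIR = 60%
target_counts = (300, 700)   # PIR = 30%
print(delta_pir(target_counts, control_counts))  # -30.0
```

Under this convention, a gene whose loss lowers retention of the reporter intron has a negative ΔPIR, which matches the ordering "from lowest to highest" used in the heatmap.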

U2AF2 binds to intron 2-containing PURPL transcripts and promotes intron 2 retention.
(A) IGV snapshot of eCLIP-seq data showing binding sites and enrichment of U2AF2 and U2AF1 on the PURPL transcripts around intron 2. eCLIP-seq data for PTBP1, PRPF8, and SRSF1 are also shown. The RefSeq-annotated locus is also indicated. eCLIP-seq data were downloaded from encodeproject.org. (B) RT-qPCR after RNA-IPs using a U2AF2 antibody with primer pairs specifically detecting PURPL transcripts as indicated in Figure S1B. U2AF2 binds to transcripts containing intron 2 but not the ones with intron 1, intron 3, or spliced exons 2 and 3. Samples were normalized to IgG-IP. 18S was used as a loading control. (C) Validation of hgRNAs using the PURPL minigene in HAP1 and RPE1 cells. The gels show RT-PCR products for PURPL transcripts with specific primers for the minigene. The hgRNAs targeted either an intergenic region or U2AF2. The schematics next to the gel indicate the expected products of the intron-retained and spliced isoforms. (D) Western blot for U2AF2 showing successful knockdown of U2AF2 protein in SKHEP1 cells. GAPDH was used as a loading control (lower panel). (E) Schematic of the PCR primer triplet used to detect intron 2 retention (red) or splicing (purple). The length of each PCR product is indicated. (F) Gel with RT-PCR products for PURPL upon knockdown of U2AF2 with 2 different siRNAs in SKHEP1 cells. The schematics next to the gel indicate the expected products of the intron-retained and spliced isoforms. Between the two expected PCR products, we observed an extra band corresponding to the inclusion of an alternative exon inside PURPL intron 2, as annotated in RefSeq; the inclusion of this exon is not affected by U2AF2. (G) Graph with quantitation of the gel bands in (F). Error bars in (G) represent standard deviations from 3 experiments. ***p<0.001.

Expression of Intron 2 Retaining PURPL leads to high proliferation.
(A) RNA-FISH images for PURPL with intron 2 retention and MALAT1 in HCT116 cells without treatment or after 24 h of 2 mM hydroxyurea (HU) to induce PURPL expression. Scale bar is 10 μm. (B) RNA stability assays were performed for PURPL transcripts by measuring their levels by RT-qPCR following 0, 2, and 4 h of ActD treatment in SKHEP1 cells. MYC was used as a positive control for unstable RNA. (C) RT-qPCR showing PURPL depletion in SKHEP1 cells using 3 different gRNAs (sg2, sg4, sg5) compared to a non-targeting control gRNA. (D) RT-qPCR for intron 2-containing PURPL transcript after 48 h of 1 μg/mL doxycycline treatment in comparison to no treatment in SKHEP1 PURPL-CRISPRi populations using the 3 gRNAs as in (C). (E) Proliferation assay showing the effect of overexpression of intron 2-containing PURPL transcript on the proliferation of SKHEP1 cells in which endogenous PURPL is knocked down with CRISPRi. Error bars represent standard deviation from 3 populations with different gRNAs. The cells were treated with 1 μg/mL doxycycline to induce intron 2-retained PURPL expression and cell proliferation was monitored at 3 and 6 days. (F) Gel with RT-PCR products for transcripts using primers as indicated in Fig. 2E upon doxycycline treatment of SKHEP1 CRISPRi cells. The three repeats represent 3 different clones of cells. The last lane is an RT- control. The schematics next to the gel indicate the expected products of the intron-retained and spliced isoforms. NS indicates a non-specific band. Error bars in (B), (C), and (D) represent standard deviations from 2 experiments. *p<0.05, **p<0.01, ***p<0.001.

U2AF2 directly promotes Intron Retention of MALAT1.
(A) U2AF2 was knocked down in SKHEP1 cells and 72 h later, RNA was extracted and RNA-seq was performed. Left: Number of decreased (blue) and increased (red) IR events at various p-values after U2AF2 knockdown as analyzed with the IRFinder algorithm. The purple arrow indicates the PURPL IR event and green arrows indicate MALAT1 IR events. Right: Pie chart of the numbers of increased and decreased IR events upon U2AF2 knockdown. (B) Floating bar plots showing the IR ratio of PURPL intron 2 (left) and the IR ratio of intron 1 (middle) and intron 2 (right) of MALAT1 in siCTRL and siU2AF2#1 samples as analyzed with the IRFinder algorithm. (C) and (E) RT-PCR for MALAT1 using a primer pair flanking the regulated intron 1 (C) or intron 2 (E) upon knockdown of U2AF2 with 2 different siRNAs in HCT116 and SKHEP1 cells. The schematics next to the gel indicate the expected products of the intron-retained and spliced isoforms. (D) and (F) Bar graphs with quantitation of the gel bands from (C) and (E) in SKHEP1 cells. Error bars represent standard deviations from 2 independent experiments. *p<0.05, **p<0.01, ***p<0.001.

U2AF2 promotes localization of MALAT1 to nuclear speckles.
(A) RNA-FISH images for MALAT1 and immunofluorescence images for SON are shown upon transfection of SKHEP1 cells with siCTRL or with two different siRNAs against U2AF2. MALAT1 is enriched in nuclear speckles in siCTRL cells but not upon U2AF2 knockdown. (B) Quantitation of the speckle-to-nucleoplasm MALAT1 signal ratio in the three replicates in panel (A). ****p<0.0001.
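The ratio quantified in panel (B) compares MALAT1 signal inside speckles with MALAT1 signal in the rest of the nucleus. The sketch below illustrates one plausible way to compute such a ratio from per-pixel values. It assumes that speckles are defined by a mask derived from the SON channel and that the ratio is taken between mean intensities; the legend does not state either choice, and the function name and pixel values are hypothetical.

```python
from statistics import mean


def speckle_to_nucleoplasm_ratio(malat1, speckle_mask, nucleus_mask):
    """Mean MALAT1 intensity in speckles divided by mean intensity in the remaining nucleus.

    All three arguments are flat, equal-length sequences (one entry per pixel).
    speckle_mask and nucleus_mask hold truthy values for pixels inside the region.
    Assumption: speckles are segmented from the SON immunofluorescence channel.
    """
    pixels = list(zip(malat1, speckle_mask, nucleus_mask))
    speckle = [v for v, s, n in pixels if n and s]
    nucleoplasm = [v for v, s, n in pixels if n and not s]
    if not speckle or not nucleoplasm:
        raise ValueError("both regions need at least one pixel")
    background = mean(nucleoplasm)
    if background == 0:
        raise ValueError("nucleoplasm signal is zero; ratio is undefined")
    return mean(speckle) / background


# Hypothetical five-pixel example: two speckle pixels, two nucleoplasm pixels,
# and one pixel outside the nucleus that is ignored.
malat1 = [10, 12, 2, 2, 0]
speckles = [1, 1, 0, 0, 0]
nucleus = [1, 1, 1, 1, 0]
print(speckle_to_nucleoplasm_ratio(malat1, speckles, nucleus))  # 5.5
```

A ratio well above 1 corresponds to speckle enrichment, whereas a ratio near 1 corresponds to a diffuse nuclear signal.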

Intron 2 of MALAT1 dictates its localization to nuclear speckles.
(A) RNA-FISH images for MALAT1 and immunofluorescence images for SON in an HCT116 MALAT1-KO clone (clone 1) in which Empty Vector (EV) or constructs that exogenously express MALAT1 transcripts were re-introduced using a doxycycline-inducible lentivirus. Expressed MALAT1 transcripts were full-length (WT), intron 1 deleted (del-1), intron 2 deleted (del-2), or deleted for both introns simultaneously (del-1+2). The cells were first treated with 1 μg/mL doxycycline for 48 h to induce MALAT1 expression. MALAT1 without intron 2 is not enriched in speckles, as shown by the SON protein signal. Scale bar is 10 μm. (B) Graph showing quantification of the percentage of cells in which MALAT1 transcripts are enriched in speckles or diffuse in HCT116 clone 1 cells. Error bars represent standard deviations from 2 independent experiments.

Intron 2 of MALAT1 regulates the migration capacity of breast cancer cells.
(A) RNA-FISH images for MALAT1 and immunofluorescence images for SON are shown upon transfection of MDA-MB-231 cells with siCTRL or siU2AF2. MALAT1 is enriched in nuclear speckles in siCTRL cells but not upon U2AF2 knockdown. (B) Quantitation of the speckle-to-nucleoplasm MALAT1 signal ratio in the three replicates in panel (A). (C) RNA-FISH images for MALAT1 and immunofluorescence images for SON in MDA-MB-231 clones that are WT, carry a whole-locus MALAT1 deletion, or carry an intron 2 deletion. When intron 2 is deleted, MALAT1 is not enriched in speckles, as shown by the SON protein signal. Scale bar is 10 μm. (D) Quantitation of the speckle-to-nucleoplasm MALAT1 signal ratio in the three replicates in panel (C). (E) Graph showing quantification of the percentage of MALAT1 transcripts enriched in speckles or diffuse in MDA-MB-231 cells in panel (C), N=2. (F) Bar graph showing the number of cells that migrated in transwell migration assays conducted in MDA-MB-231 WT, MALAT1 KO, and MALAT1 intron 2 deletion clones. Intron 2 deletion leads to decreased migration potential. The number of cells is the sum of cells from 5 different fields, N=4. *p<0.05, **p<0.01, ***p<0.001, ****p<0.0001.