Introduction

Cervical cancer is a leading cause of mortality in women worldwide. It is the second most diagnosed cancer among women, with an estimated 530,000 new cases worldwide each year, and the third most frequent cause of cancer-related death, accounting for 270,000 deaths annually [1]. Persistent infection of the cervical epithelium with the cancer-associated alpha human papillomaviruses (high-risk HPVs), in particular HPV16 and HPV18, with sustained expression of the viral oncoproteins E6 and E7, contributes to squamous epithelial cell tumorigenesis [2–4]. Approximately 1.4% of high-grade neoplastic cervical lesions progress to invasive cervical cancer [5], while most lesions resolve spontaneously. The progression to cervical cancer typically takes 15 to 20 years after initial infection (https://www.who.int/news-room/fact-sheets/detail/cervical-cancer). These observations suggest that active host defense systems play important roles in preventing malignant transformation of neoplastic cervical lesions, and that additional factors besides E6 and E7 are necessary for cervical cancer development. Indeed, one tumor suppressor network, the Fanconi anemia (FA) pathway [6], can be activated by HPV-induced DNA damage. FA proteins including FANCA, FANCD2 and FANCI are elevated upon HPV infection [7–9], and the activated FA pathway restricts HPV replication and guards host genome stability [10]. FANCD2 mutation stimulates HPV16 and HPV31 genome amplification and also promotes cervical and vaginal cancer development in HPV16 E7 transgenic mice [11, 12].

HPV-positive cervical cancer tissues in general exhibit wild-type p53, wild-type K-RAS, and no overexpression of RAS genes [13–16]. Viral oncoprotein E7 plays a major role in the immortalization of primary epithelial cells, mainly by inactivation of the pRb family of proteins [17], whereas E6 mediates degradation of tumor suppressor p53 and activation of human telomerase reverse transcriptase (hTERT) transcription [18–20]. However, high-risk E6 and E7 are necessary but not sufficient to fully transform the immortalized cells into malignant cells [21–25], and additional oncogenic stress is needed. For example, HRASG12V, a constitutively activated RAS GTPase mutant, triggers malignant transformation of E6/E7-expressing keratinocytes [26–29]. Normal RAS regulates cell proliferation, cell differentiation and cell adhesion through cellular signal transduction from growth-factor receptors such as insulin-like growth factors (IGFs) and insulin-like growth factor binding proteins (IGFBP1-6) [30–32]. Thus, overactive RAS signaling might facilitate transformation of E6/E7-expressing keratinocytes. In transgenic mice, HPV16 E6 and E7 can promote the development of spontaneous epithelial skin tumors, but not spontaneous tumors of the reproductive tract [33–36]; the latter require prolonged estrogen treatment [37]. Estrogen receptor (ER) activation stimulates the mitogen-activated protein kinase (MAPK/Erk) and phosphoinositide 3-kinase (PI3K/Akt) pathways, both of which are mediated by RAS [38]. However, the molecular mechanism underlying estrogen-induced cervical cancer in E6 and E7 transgenic mice is not fully understood.

Long non-coding RNAs (lncRNAs) are RNAs over 200 nucleotides (nt) in length that lack the coding capacity for translation into functional proteins. Although several lncRNAs have been proposed to function in diverse biological processes, evidence is still lacking to support the functionality of the majority of lncRNAs [8, 39]. Nevertheless, the identified lncRNA functions include regulating: (1) transcription of neighboring or distant genes by recruiting histone modifiers, as shown for Xist [40] and ANRIL [41], or by causing chromosome looping, as proposed for PVT1 [42] and CCAT1-L [43]; (2) RNA splicing by acting as a decoy for splicing factors, as demonstrated for MALAT1 [44] and PNCTR [45]; (3) mRNA stability by directly base pairing as antisense transcripts [46], by acting as an RNA-binding protein (RBP) decoy to prevent mRNA-RBP interaction, as is the case for NORAD [47], or by functioning as miRNA sponges to prevent miRNA-induced degradation of mRNAs, as shown for PTEN [48] and PNUTS [49]; and (4) protein stability by direct binding, such as MaIL1 [50], or by acting as a protein decoy to prevent ubiquitin-mediated proteasomal degradation, such as FAST [51]. Thus, the genomic position and subcellular localization of a lncRNA and its interactions with DNA, RNA and proteins are crucial determinants of its functionality.

We recently reported an increased level of lnc-FANCI-2 expression in high-risk HPV-positive cervical intraepithelial neoplasia (CIN) and invasive cervical cancer (ICC) tissues [8]. At least 14 RNA isoforms of lnc-FANCI-2, which is wrongly annotated by NCBI as MIR9-3HG in the recently updated RefSeq database (https://www.ncbi.nlm.nih.gov/refseq/), are expressed from the genomic locus adjacent to FANCI on Chr15 through usage of two alternative promoters, which are ∼10 kb downstream of a non-expressible miR9-3 gene [8], alternative RNA splicing, and alternative selection of RNA polyadenylation sites. Both lnc-FANCI-2 and FANCI are up-regulated simultaneously in neoplastic cervical lesions and cervical cancer by high-risk HPV infections. E6 and, in particular, E7 are responsible for the enhanced expression of lnc-FANCI-2, the transcription of which is also regulated by YY1 [8]. HPV infection increases YY1 levels but decreases the expression of p53-dependent miR-29a, which targets the YY1 3ʹ UTR. Viral E7 interacts with YY1 and facilitates YY1 transactivation of the lnc-FANCI-2 promoter [8]. In situ hybridization has revealed that lnc-FANCI-2 is preferentially cytoplasmic, but its function in HPV-infected cells has not been characterized. In this report, we demonstrate that knockout (KO) or knockdown (KD) of lnc-FANCI-2 promotes RAS signaling and phosphorylation of Akt and Erk.

Results

Differential expression of lnc-FANCI-2 in cervical cancer tissues and its derived cell lines

As lnc-FANCI-2 expression is up-regulated along with cervical lesion progression by high-risk HPV infections [8], we further verified the increased expression of lnc-FANCI-2 in cervical cancer tissues by RNAscope single-molecule RNA in situ hybridization (RNA-ISH) using a lnc-FANCI-2 antisense probe spanning the nt 359-1713 region of the major lnc-FANCI-2 isoform (GenBank MT669800.1). We found increased lnc-FANCI-2 ISH signals in both the cytoplasm and the nucleus within the tumor nest in HPV16-infected cervical cancer tissues, but much less so in the adjacent normal tissue areas (Fig. 1A). The increased expression of lnc-FANCI-2 was also evident in human foreskin keratinocytes (HFK) with HPV16 or HPV18 infection when compared to HFK cells without HPV infection (Fig. 1B).

Increased expression of lnc-FANCI-2 in HPV16-infected cervical cancer tissues and HPV16- and HPV18-infected raft culture tissues. (A) Expression of lnc-FANCI-2 in HPV16+ cervical cancer tissues was examined by RNAscope RNA-ISH analysis. Nuclei were stained with DAPI (blue). (B) Human foreskin keratinocyte (HFK)-derived raft cultures without (HFK) or with HPV16 or HPV18 infections (HFK-HPV16 or HFK-HPV18) were examined at day 10 for lnc-FANCI-2 RNA by RNAscope RNA-ISH analysis. Scale bars: 25 μm in the original figures and 10 μm in the zoomed insets.

However, we subsequently demonstrated the high expression of lnc-FANCI-2 in four HPV16-infected cervical cancer cell lines (SiHa, CaSki, W12 20861 and W12 20863 cells), but not in two HPV18-infected cervical cancer cell lines (HeLa and C4II cells), nor in the HPV-negative cell lines HCT116 (colorectal cancer cells), BCBL-1 (body cavity B lymphoma cells), HEK293 (Ad5 E1/E2-immortalized human kidney cells), and HaCaT (spontaneously immortalized human epidermal cells) (Fig. S1A). Interestingly, high expression of lnc-FANCI-2 was observed in the HPV-negative cervical cancer cell line C33A, which carries mutations in both the p53 and RB genes [52] (Fig. S1A). Northern blot confirmed the increased expression of lnc-FANCI-2 in C33A cells and the absence of lnc-FANCI-2 expression in HeLa cells (Fig. S1B).

Subcellular distribution of lnc-FANCI-2 was characterized by cell fractionation, with Western blot assays of nuclear SRSF3 and cytoplasmic GAPDH proteins serving as indicators of fractionation efficiency (Fig. S1C). By RT-qPCR analysis of the fractionated total RNA, we demonstrated that lnc-FANCI-2 RNA is mainly cytoplasmic in HPV16-positive CaSki cells, but nuclear in HPV16-positive SiHa cells (Fig. S1C). The differential subcellular distributions of lnc-FANCI-2 RNA between CaSki and SiHa cells were further confirmed by RNAscope RNA-ISH using the lnc-FANCI-2 antisense probe described above (Fig. S1D).

lnc-FANCI-2 regulates proliferation of HPV-transformed cervical cancer cells

lnc-FANCI-2 RNA is transcribed mainly from a proximal promoter, TSS2, but also from an alternative minor, distal promoter, TSS1. Two highly conserved YY1 binding motifs upstream of TSS2 are essential for the TSS2 transcriptional activity [8] (Fig. 2A). To elucidate the function of lnc-FANCI-2 in high-risk HPV-infected cervical cancer cells, we knocked out (KO) lnc-FANCI-2 expression in HPV16-positive CaSki cells using CRISPR/Cas9 deletion of either a 3-kb promoter region encompassing both TSS1 and TSS2 or an 86-bp region containing two YY1-binding motifs in the TSS2 promoter (Fig. 2A). Two tested gRNAs with high KO efficiency were selected and cloned into a modified CRISPR/Cas9 expression vector to express both the 5ʹ-specific gRNA and the 3ʹ-specific gRNA simultaneously for efficient genome editing [53]. CaSki cells stably transfected with the dual gRNA expression vector were generated. Through serial dilution, several single cell clones were isolated and verified for homozygous deletion by PCR screening. PCR genotyping indicated successful homozygous deletion of the promoter or YY1-binding motifs, respectively (Fig. 2B).

Knockout (KO) of lnc-FANCI-2 in CaSki cells affects cell proliferation, colony formation and migration. (A) Diagram and KO strategies of the lnc-FANCI-2 gene. At the top of the figure is the lnc-FANCI-2 gene structure with its alternative transcription start sites (TSS) and polyadenylation sites (pA). TSS2 and pA2, marked with heavier arrows, are predominantly used for lnc-FANCI-2 expression. The lower part of the figure shows the KO strategies. Red slashes represent gRNA-targeted sites used to create genomic DNA deletions by CRISPR-Cas9 technology. Deletion of YY1-binding motifs (ΔYY1) removed an 86-bp DNA fragment containing two YY1-binding motifs. Deletion of the lnc-FANCI-2 promoter (ΔPr) removed a ∼3.3-kb promoter region. Primers (P1-P4) used for PCR screening are shown as black arrows. (B) PCR screening of single cell clones with homozygous lnc-FANCI-2 KO, using the indicated primer sets diagrammed in (A). Single cell clones selected from the cells transfected with an empty vector served as control (Ctrl) cells. (C) Evaluation of lnc-FANCI-2 KO efficiency in the selected single cell clones (B) by RT-qPCR. (D) Northern blot validation of lnc-FANCI-2 KO efficiency from the individual single cell clones or the parental (WT) CaSki cells on polyA+ RNA (enriched from 100 μg of total cell RNA, lanes 1-4) or 10 μg of total cell RNA (lanes 5-9). Total cell RNA (2 μg) of HEK293T cells with ectopic expression of two isoforms (short, S or long, L) of lnc-FANCI-2 cDNA served as a control. Antisense oligo probe P5 (A) labeled with 32P was used for hybridization. GAPDH RNA served as a loading control and was hybridized with a GAPDH-specific oligo probe. (E-F) RNA in situ hybridization (RNA-ISH) validation of lnc-FANCI-2 KO efficiency by RNAscope technology. Two single cell clones were examined, with red color for lnc-FANCI-2 and blue color for the nucleus (E). Bar graphs show the copy number of lnc-FANCI-2 per cell in the WT, ΔYY1-D5 or ΔPr-A9 CaSki cells (each averaged from 200 cells) (F). **, P<0.01; ***, P<0.001 by two-tailed Student's t test.

Loss of lnc-FANCI-2 expression in the single-cell clones was examined by RT-qPCR (Fig. 2C). When compared to the control clones selected after transfection with an empty vector containing no gRNA, all single cell clones with deletion of YY1-binding motifs (ΔYY1) showed a 70-80% reduction of lnc-FANCI-2 expression, whereas the single cell clones with deletion of both TSS1 and TSS2 promoters (ΔPr) showed at least a 90% reduction (Fig. 2C). Northern blot analysis confirmed the decrease in levels of the two major lnc-FANCI-2 isoforms (Fig. 2D), an abundant 4-kb lnc-FANCI-2 RNA derived from the distal polyadenylation site pA2 and a less abundant 2-kb lnc-FANCI-2 RNA derived from the proximal polyadenylation site pA1 [8] (Fig. 2A and 2D). Reduced lnc-FANCI-2 RNA expression was also confirmed by RNAscope RNA-ISH (Fig. 2E and 2F). The parental (WT) CaSki cells expressed ∼10 copies of lnc-FANCI-2 RNA per cell [8], whereas the copy number of lnc-FANCI-2 RNA dropped to ∼5 copies per cell in the ΔYY1-D5 cells and to ∼2 copies per cell in the ΔPr-A9 cells (each averaged from 200 cells).

Although the WT CaSki cells grow as cell islands, the lnc-FANCI-2 KO cells displayed a dispersed cell growth pattern and an irregular or spindle-like cell morphology which was more obvious in the ΔPr-A9 cells (Fig. S2A). The ΔPr-A9 cells were further examined for effects on HPV16 E6 and E7 expression. When compared with WT CaSki cells, the ΔPr-A9 cells showed little change in E6, E7, p53 (E6 downstream target) and E2F1 (E7 downstream target) protein levels (Fig. S2B).

lnc-FANCI-2 regulates the expression and secretion of cell soluble receptors

We considered that altered expression of cell membrane proteins and secreted factors in the lnc-FANCI-2 KO cells might contribute to the observed different cell morphology and growth properties. Therefore, we performed a Proteome Profiler Human sReceptor Array analysis to examine possible changes in expression of 105 well-characterized soluble protein receptors using total cell lysates and cell culture supernatants from ΔPr-A9 and WT CaSki cells.

Using a 30% cut-off (FC -/+0.3), we identified from the ΔPr-A9 cell lysate nine proteins with increased expression compared to the WT CaSki cells, including PODXL2, ECM1, NECTIN2, MCAM, ADAM9, ADAM10, CDH5, ITGA5 and NOTCH1, and six proteins with decreased expression, including ITGB6, CDH13, LGALS3BP, TIMP2, ADAM8 and SCARF2 (Fig. 3A and 3B). We also found from the ΔPr-A9 cell culture supernatant five proteins with increased expression, including ADAM9, NECTIN2, ADAM10, CDH5 and ECM1, and four proteins with decreased expression, including CRELD2, SDC1, SDC4 and TIMP2 (Fig. 3A and 3B). By immunoblot assays, we selectively verified the increase in PODXL2, MCAM, and ECM1 and the decrease in ADAM8 and TIMP2 in the ΔPr-A9 cell lysates, and the increase in ECM1 and the decrease in TIMP2 in the ΔPr-A9 cell culture supernatant (Fig. 3C). As ADAM8 is proteolytically processed into two protein isoforms [54], we confirmed by immunoblot the decreased expression of all three sizes of ADAM8 protein in the ΔPr-A9 cell lysate relative to the WT CaSki cells (Fig. 3C).

KO of lnc-FANCI-2 in CaSki cells affects expression of cellular soluble receptors. (A) Dot blots show differentially expressed soluble receptors in ΔPr-A9 cells when compared to those in the WT CaSki cells. A Proteome Profiler Human sReceptor Array was used to examine 105 cellular soluble receptors. Label in red, upregulated soluble receptors; label in blue, downregulated soluble receptors. (B) Quantification of differentially expressed and/or released soluble receptors (A) in ΔPr-A9 cells when compared to those in the WT CaSki cells. Error bars represent the standard deviation of replicates in each proteome array. The dashed lines indicate the fold change (FC) thresholds (-/+ 0.3 FC); soluble receptors with a score above or below the thresholds were considered differentially expressed and/or released. (C) Validation of several differentially expressed soluble receptors by immunoblot analysis using specific antibodies. Tubulin served as an internal loading control. *, non-specific band in the MCAM immunoblot.

lnc-FANCI-2 regulates the expression of genes involved in RAS signaling

We next conducted genome-wide RNA-seq analyses of WT and lnc-FANCI-2 KO cells (four samples/group) to determine the transcriptomic consequences of lnc-FANCI-2 deficiency. We obtained ∼120 million mappable RNA reads to the human reference genome hg38 from each sample. Analysis of RNA-seq reads-coverage map by Integrative Genomics Viewer (IGV) showed ∼90% reduction of lnc-FANCI-2 expression in ΔPr-A9 cells and ∼72% decrease in ΔYY1-D5 cells compared to the WT CaSki cells (Fig. 4A), consistent with the results shown in Fig. 2C-F.

Transcriptomic effect of lnc-FANCI-2 KO in CaSki cells by RNA-seq analysis. (A) RNA-seq reads-coverage maps by IGV showing the expression levels of lnc-FANCI-2 in two KO cell clones, ΔPr-A9 and ΔYY1-D5, relative to the WT CaSki cells. One representative coverage profile of four for each cell type is shown. Red slashes represent the deleted genomic region, with the fourteen validated lnc-FANCI-2 RNA isoforms shown below [8]. (B) Similarity heatmap comparing ΔPr-A9, ΔYY1-D5 and WT cells, generated using Limma-normalized counts from each sample by the Pearson complete linkage method. (C) Volcano plot visualization of differentially expressed genes (DEGs) in ΔPr-A9 and ΔYY1-D5. The genes with the most significant change in the increased (red) or decreased (blue) expression are indicated. FC, fold change. (D) Venn diagram depicting the overlapped DEGs between ΔPr-A9 and ΔYY1-D5 cells over the WT CaSki cells. (E) Heatmap showing overlapped DEGs with FPKM ≥7 in ΔPr-A9 and ΔYY1-D5 cells when compared with the WT CaSki cells. (F) Validation of 12 selected upregulated or downregulated DEGs shown in the heatmap by RT-PCR with gene-specific primers in the presence (+) or absence (-) of reverse transcriptase (RT). GAPDH served as an internal RNA control. The relative expression level of each gene was calculated based on band density after normalizing to the GAPDH RNA band, with the expression level in the WT CaSki cells set as 1. (G) Validation of the upregulated expression of IGFBP3 in ΔPr-A9 cells over the WT CaSki cells by immunoblot analysis.

Hierarchical clustering analysis showed greater global transcriptional similarity between ΔYY1-D5 and ΔPr-A9 cells than between either KO clone and the WT CaSki cells (Fig. 4B). By applying a threshold of fold change (FC) ≥ 1.8 or FC ≤ -1.8 with FDR ≤ 0.01, we found 1230 genes in ΔPr-A9 and 797 genes in ΔYY1-D5 cells with significantly differential expression relative to the WT CaSki cells, which express ∼15890 genes. The most significantly affected genes are shown in Volcano plots (Fig. 4C and Table S1). Among these, 211 were upregulated and 189 were downregulated in both ΔPr-A9 and ΔYY1-D5 cells (Fig. 4D and Table S2). By applying a more stringent criterion of FPKM ≥7 in at least one of four RNA-seq samples as a cutoff, we profiled the genes with large expression differences in both ΔPr-A9 and ΔYY1-D5 cells. The results displayed in the heatmap of Fig. 4E included 52 upregulated and 47 downregulated genes, excluding lnc-FANCI-2. Subsequently, we selectively verified by RT-PCR the decreased expression of ITGB6, NLRP2, PLAC8, PSG4, PSMB9, and SERPINB1 and the increased expression of CFH, CNTN5, EMP3, IGFBP3, PTGS2, and SERPINB2 in ΔPr-A9 cells (Fig. 4F), as well as the increased expression of IGFBP3 protein, a RAS signaling driver [30–32], in ΔPr-A9 cells by immunoblot (Fig. 4G).
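
The selection criteria described above can be illustrated with a minimal Python sketch. It assumes that fold change is reported as a signed value (so that -1.8 denotes a 1.8-fold decrease) and that the FPKM filter applies to any one sample of a clone; the record type `GeneResult` and the function names are hypothetical and do not represent the analysis pipeline actually used in this study.

```python
from dataclasses import dataclass
from typing import Dict, List

FC_CUTOFF = 1.8     # |signed fold change| threshold stated in the text
FDR_CUTOFF = 0.01   # false discovery rate threshold stated in the text
FPKM_CUTOFF = 7.0   # expression cutoff for the more stringent heatmap set


@dataclass
class GeneResult:
    """Hypothetical per-gene record for one KO clone compared with WT."""
    signed_fc: float   # assumed convention: +2.0 = 2-fold up, -2.0 = 2-fold down
    fdr: float
    fpkm: List[float]  # FPKM values of the individual RNA-seq samples


def is_deg(r: GeneResult) -> bool:
    """True if FC >= 1.8 or FC <= -1.8, with FDR <= 0.01."""
    return abs(r.signed_fc) >= FC_CUTOFF and r.fdr <= FDR_CUTOFF


def shared_degs(clone_a: Dict[str, GeneResult],
                clone_b: Dict[str, GeneResult]) -> Dict[str, str]:
    """Genes that are DEGs in both clones and change in the same direction."""
    shared: Dict[str, str] = {}
    for gene, a in clone_a.items():
        b = clone_b.get(gene)
        if b is None or not (is_deg(a) and is_deg(b)):
            continue
        if (a.signed_fc > 0) == (b.signed_fc > 0):
            shared[gene] = "up" if a.signed_fc > 0 else "down"
    return shared


def passes_fpkm(r: GeneResult) -> bool:
    """True if FPKM >= 7 in at least one RNA-seq sample."""
    return max(r.fpkm) >= FPKM_CUTOFF
```

Under these assumptions, calling `shared_degs` on the per-gene results of the two clones would yield the overlap depicted in a Venn diagram such as Fig. 4D, and `passes_fpkm` would then restrict that overlap to the more highly expressed genes.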

By performing Gene Set Enrichment Analysis (GSEA) on the Hallmark gene sets, which provide more refined and concise inputs for GSEA [55], we found that the most significantly upregulated pathways in both ΔPr-A9 and ΔYY1-D5 cells, when compared with the WT CaSki cells, were KRAS signaling and epithelial mesenchymal transition (EMT). The most significantly downregulated pathways in both ΔPr-A9 and ΔYY1-D5 cells over the WT CaSki cells were interferon gamma (IFN-γ) and interferon alpha (IFN-α) responses (Fig. 5A). GSEA plots for ΔPr-A9 cells show that 29 of 111 genes in RAS signaling, including IGFBP3 [30–32], and 39 of 130 genes in EMT (Fig. 5B and 5C, Table S3) were significantly upregulated, whereas 37 of 139 genes in IFN-γ response and 24 of 71 genes in IFN-α response were significantly downregulated (Fig. 5D and 5E, Table S3). Similar enriched gene sets with increased RAS signaling and EMT and decreased IFN-γ and IFN-α responses were observed in ΔYY1-D5 cells (Fig. S3, Table S3).

Pathway analyses of DEGs identified by RNA-seq. (A) Top 3 upregulated and top 4 downregulated pathways in ΔPr-A9 and ΔYY1-D5 cells over the WT CaSki cells. Data were generated by Gene Set Enrichment Analysis (GSEA) performed with Hallmark gene sets. NES stands for normalized enrichment score. (B-E) Each GSEA Enrichment plot shows the enrichment score and gene hits enriched in ΔPr-A9 cells using Hallmark gene sets, KRAS_SIGNALING_UP (B), EPITHELIAL_MESENCHYMAL_TRANSITION (C), INTERFERON_GAMMA_RESPONSE (D) or INTERFERON_ALPHA_RESPONSE (E). padj, adjusted p value; Zero crossed 7342, the middle number of total genes in the GSEA at ranking. The heatmaps below the enrichment plots (B and C) visualize the genes enriched in respective pathways of KRAS_SIGNALING_UP (B) and EPITHELIAL_MESENCHYMAL_TRANSITION (C) in ΔPr-A9 cells when compared to the WT CaSki cells.

Since ΔPr-A9 cells exhibit higher KO efficiency of lnc-FANCI-2 and a higher number of differentially expressed genes (DEGs) and show more severe phenotypic changes in cell proliferation, migration, and colony formation than ΔYY1-D5 cells, the ΔPr-A9 cells were primarily used for further focused studies on RAS signaling in this report.

lnc-FANCI-2 regulates RAS GTPase activities to phosphorylate RAS signaling effectors Akt and Erk

RAS activation triggers two major downstream signal transduction pathways, Raf/Mek/Erk and PI3K/Akt (Fig. S4), to transduce signals from extracellular stimuli to the cell nucleus, where specific effector genes are activated for their corresponding functions [30–32, 56, 57]. To further confirm the GSEA data and verify that the two major signaling pathways of RAS are activated, we first examined and compared the RAS GTPase activities in ΔPr-A9 cells and the WT CaSki cells. We found a significant increase in RAS GTPase activity in ΔPr-A9 cells (Fig. 6A, top bar graphs), indicating that the endogenous lnc-FANCI-2 RNA in CaSki cells suppresses not only the expression of IGFBP3 (Fig. 4F and 4G), the most abundant IGFBP, but also RAS activation. Since ΔPr-A9 cells had undergone long-term selection and adaptation during single cell screening, the increased RAS GTPase activity might have resulted from selection pressure for cell survival. However, by transient siRNA knockdown (KD) of lnc-FANCI-2 expression in the WT CaSki cells, we obtained a similar, albeit weaker, increase of RAS GTPase activity (Fig. 6A, lower bar graphs). This result suggests that the increased RAS GTPase activity in ΔPr-A9 cells resulted not from persistent selection pressure but from the deficiency of lnc-FANCI-2.

CaSki cells with lnc-FANCI-2 KO exhibit activation of the RAS signaling pathway. (A) CaSki cells with lnc-FANCI-2 KO (top, ΔPr-A9 cells) or knockdown (KD, lower) by lnc-FANCI-2-specific siRNAs display increased RAS GTPase activity when compared to the WT CaSki cells (top) or the WT cells treated with a nonspecific control siRNA (siNS). The relative RAS GTPase activity was measured by a RAS GTPase Chemi ELISA assay. *, P<0.05; ***, P<0.001 by two-tailed Student's t test. (B and C) Selective validation of increased expression of RAS signaling-related downstream genes in lnc-FANCI-2 KO ΔPr-A9 cells (B) or lnc-FANCI-2 KD CaSki cells (C) over the WT CaSki cells. Relative expression of the indicated genes in ΔPr-A9 cells in comparison to the WT cells (B) or in the WT CaSki cells treated for 48 h and 96 h with a nonspecific control siRNA (siNS) or a siRNA specifically targeting the lnc-FANCI-2 exon 3 (C) was examined by immunoblot analyses using the corresponding antibodies as indicated. Tubulin served as an internal control for each blot. The level of p-Akt or p-Erk was calculated by normalizing to total Akt or Erk protein, and the levels of the other proteins by normalizing to tubulin. The protein level in the WT cells or WT cells treated with siNS was set as 1. The KD efficiency of lnc-FANCI-2 RNA (C, top bar graphs) was examined by RT-qPCR. (D) The effect of blocking RAS signaling on the expression of MCAM and VIM using PI3K inhibitor LY294002 (20 μM) and MEK inhibitor U0126 (10 μM). The protein at each time point was examined by immunoblot analysis with a corresponding antibody. The level of MCAM or VIM was calculated after normalizing to tubulin. The protein level in ΔPr-A9 cells without the inhibitor was set as 100%. *, nonspecific protein band. (E) The time-dependent cell viability of the WT CaSki and ΔPr-A9 cells in the presence of 20 μM LY294002 or 10 μM U0126. Data were obtained at each time point after normalizing to the cells treated with DMSO. The mean + SD at each data point was calculated from 6 samples combined from two independent experiments.

Next, we examined activation of the PI3K/Akt and Raf/Mek/Erk pathways [58–60]. We observed a 3-fold increase in phosphorylated Akt (p-Akt) and a 2.5-fold increase in p-Erk1/2 (p44/p42 MAPK), after normalization to total Akt and Erk levels, in ΔPr-A9 cells over the WT CaSki cells (Fig. 6B). These results were confirmed in ΔYY1-D5 cells (Fig. S5A). The increase in p-Akt and p-Erk (mostly p-Erk2/p42) was accompanied by elevated expression of MCAM, VIM, and CCND2, and decreased expression of RAC3 (Fig. 6B). These randomly chosen potential RAS effector proteins facilitate membrane receptor function, cell proliferation, and EMT. Moreover, transient siRNA knockdown of lnc-FANCI-2 in the WT CaSki cells also led to increased levels of p-Akt and p-Erk and increased expression of MCAM and VIM at 48 h and 96 h post-transfection (Fig. 6C). These observations indicate that a transient loss of lnc-FANCI-2 in CaSki cells is sufficient to trigger RAS signaling. The same siRNA transfection had no effect on p-Akt or p-Erk1/2 levels in HeLa cells, a lnc-FANCI-2-negative cell line (Fig. S1A and S1B; Fig. S5B), but increased the levels of p-Akt and p-Erk1/2 in SiHa cells, which contain mainly nuclear lnc-FANCI-2 (Fig. S1C and S1D; Fig. S5B), verifying that the effect of lnc-FANCI-2 depletion on RAS signaling pathways is specific and independent of the predominant subcellular distribution of lnc-FANCI-2. The lnc-FANCI-2 KO-mediated increase in p-Akt and p-Erk was sensitive to PI3K inhibitor LY294002 [61] (Fig. 6D, lanes 4-6) and MEK1/2 inhibitor U0126 [62] (Fig. 6D, lanes 7-9), respectively. The inhibitors, in particular the MEK1/2 inhibitor U0126, also decreased the expression of VIM and MCAM and cell proliferation (Fig. 6D and 6E). The effects of LY294002 on cell proliferation were similar in ΔPr-A9 and WT CaSki cells.
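
The normalization behind fold changes such as those reported above can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the band densities used with them would come from densitometry of the immunoblots, which is not reproduced here.

```python
def normalized_level(target_density: float, reference_density: float) -> float:
    """Band density of a target (e.g. p-Akt) divided by that of its reference
    (e.g. total Akt for phospho-proteins, or tubulin for other proteins)."""
    if reference_density <= 0:
        raise ValueError("reference band density must be positive")
    return target_density / reference_density


def fold_over_control(sample_target: float, sample_reference: float,
                      control_target: float, control_reference: float) -> float:
    """Normalized level in a sample relative to the control, which is set as 1."""
    control_level = normalized_level(control_target, control_reference)
    if control_level == 0:
        raise ValueError("control level must be non-zero")
    return normalized_level(sample_target, sample_reference) / control_level
```

For example, with purely hypothetical densities, a KO sample with a p-Akt band of 600 and a total Akt band of 200, compared with a WT control with bands of 400 and 400, would give a 3-fold increase. Normalizing to total protein first means that a change in total Akt or Erk abundance is not misread as a change in phosphorylation.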

lnc-FANCI-2 inhibits the expression of IGFBP3 and MCAM (CD146 or MUC18)

Given that IGFBP3 is the most abundant of the six IGFBP members, acting in potentiation of IGF action and PI3K/AKT activities [30], and that a reduced IGFBP3 mRNA level is associated with progression to cervical cancer [63], we further investigated the effect of lnc-FANCI-2 on the expression of IGFBP3 as an lnc-FANCI-2 effector gene. As shown in Fig. 4F-G and Fig. S6A, RNA and protein expression of IGFBP3 was increased in the lnc-FANCI-2 KO ΔPr-A9 cells over the parental WT CaSki cells, indicating a suppressive effect of lnc-FANCI-2 on IGFBP3 expression. This suppressive function of lnc-FANCI-2 was further confirmed in lnc-FANCI-2 rescue experiments in the ΔPr-A9 cells (Fig. 7A and 7B). After transient expression of one major isoform of lnc-FANCI-2 RNA (a-PA2, GenBank: MT669800.1) [8] in the ΔPr-A9 cells, lnc-FANCI-2 RNA (red) and IGFBP3 RNA (green) were detected at 24 h post transfection by RNAscope RNA-ISH using specific antisense RNA probes (Fig. 7A). We demonstrated that the cells with rescued expression of cytoplasmic lnc-FANCI-2 RNA displayed much reduced expression of cytoplasmic IGFBP3 RNA (Fig. 7B).

lnc-FANCI-2 is suppressive to the expression of IGFBP3 and MCAM. (A and B) Transient rescue expression of lnc-FANCI-2 in ΔPr-A9 cells inhibits expression of IGFBP3. ΔPr-A9 cells were transfected with a major isoform lnc-FANCI-2a-PA2 (GenBank ACC. No. MT669800.1) cDNA plasmid. lnc-FANCI-2 in red and IGFBP3 RNA in green were detected by RNAscope RNA-ISH at 24 h post transfection using each specific antisense RNA probe and imaged by confocal microscopy (A). Expression levels of lnc-FANCI-2 RNA and IGFBP3 RNA in four neighboring cells (B) were measured by signal intensity along a line crossing the stained cells in A (white line arrow). (C) Validation of differential expression of MCAM RNA in ΔPr-A9 cells by RT-PCR in the presence (+) or absence (-) of reverse transcriptase (RT). A primer pair with one primer in exon 11 and the other in exon 13 of MCAM RNA (NM_006500) was used. GAPDH served as a loading control. (D) The major isoform lnc-FANCI-2a-PA2 repressed the expression of MCAM. ΔPr-A9 cells were transfected with lnc-FANCI-2a-PA2 cDNA plasmid. lnc-FANCI-2 RNA in red was detected by RNAscope ISH and MCAM protein in green was detected by IF with an anti-MCAM antibody. (E) Expression levels of lnc-FANCI-2 RNA and MCAM protein in three neighboring cells were measured by signal intensity along a line crossing the stained cells (D, white line arrow). (F) Calculation of the expression levels of MCAM in lnc-FANCI-2 positive cells (n=35) and lnc-FANCI-2 negative cells (n=54) by fluorescent intensity. (G) Subcellular MCAM distributions in the nucleus (N) and cytoplasm (C) of WT CaSki cells and ΔPr-A9 cells by immunoblot analysis. Fractionation efficiency and sample loading were controlled by cytoplasmic (represented by tubulin) and nuclear (represented by hnRNP C1/C2) proteins. (H-I) Correlation and survival analysis of lnc-FANCI-2 and MCAM expression in cervical squamous cell carcinoma (CESC) cases from The Cancer Genome Atlas (TCGA) datasets by the GEPIA web server (http://gepia.cancer-pku.cn/). The negative correlation at R=-0.321 (with p-value=1.02e-08) of lnc-FANCI-2 with MCAM in cervical cancer patients was obtained from the RNA-seq data for the CESC tumor type downloaded from the TCGA data portal (https://portal.gdc.cancer.gov/). Only primary solid tumor samples (n=304 after exclusion of 2 metastatic samples and 3 normal samples) were subjected to analysis, with the data shown as a scatter plot (H). The Kaplan-Meier plot with a log-rank test p-value of 5.74e-05 (I) shows MCAM as a biomarker for poor prognosis of cervical cancer survival with a lower quartile group cutoff. RNA-seq and survival data are derived from the TCGA CESC cancer patients. (J and K) AKT and ERK phosphorylation is partially regulated by MCAM and IGFBP3 in CaSki cells. KD of MCAM (J) and IGFBP3 (K) expression in WT parental CaSki (left) or ΔPr-A9 (right) cells was performed by treatment with MCAM siRNA or IGFBP3 siRNA with or without lnc-FANCI-2 siRNA for 48 h. Expression levels of the individual proteins as indicated were examined by immunoblot analyses using the corresponding antibodies. The level of MCAM, p-Akt, or p-Erk1/2 was calculated after normalizing to tubulin. The protein level in the non-targeting siRNA (siNS) control cells was set as 1.

As shown in Fig. 4E and Fig. S6A, the increased expression of MCAM (CD146 or MUC18) [64] appears to occur at the RNA level, either transcriptionally or posttranscriptionally, in both ΔPr-A9 and ΔYY1-D5 cells. We confirmed by RT-PCR the lnc-FANCI-2 KO-mediated increase of MCAM RNA expression in ΔPr-A9 cells (Fig. 7C). A rescue experiment was further performed by transient expression of one major isoform of lnc-FANCI-2 RNA (a-PA2, GenBank: MT669800.1) [8] in ΔPr-A9 cells to determine both nuclear and cytoplasmic MCAM expression [65, 66] in individual lnc-FANCI-2 expressing cells. ΔPr-A9 cells ectopically expressing diffuse cytoplasmic lnc-FANCI-2 RNA, as detected by RNAscope RNA-ISH, exhibited a marked reduction in nuclear MCAM protein (Fig. 7D and 7E). Quantitative analyses of the ΔPr-A9 cells with or without transient lnc-FANCI-2 RNA expression showed a significant reduction of nuclear MCAM protein in the 35 cells with rescued lnc-FANCI-2 expression when compared to the 54 cells expressing no lnc-FANCI-2 (Fig. 7F). Because the membrane-bound and cytoplasmic MCAM protein was mostly removed by the protease III treatment in the RNAscope RNA-ISH procedure, only the remaining nuclear MCAM signal was detectable by IF staining. Using cell fractionation, we detected a cleaved MCAM of ∼46 kDa mainly in the nuclear fraction by Western blot (Fig. 7G). The increased nuclear MCAM expression in ΔPr-A9 cells was proportional to the cytoplasmic and total MCAM levels when compared with the WT CaSki cells (Fig. 7G). The nuclear MCAM signal appeared to be a proteolytic cleavage product, presumably generated by the increased metalloproteases ADAM9 and ADAM10 and the decreased metalloprotease inhibitor TIMP2 (Fig. 3), as observed in other studies [66].

The significance of the inverse correlation of lnc-FANCI-2 and MCAM in CaSki cells was further investigated by analyzing their expression in 304 cervical cancer samples from the TCGA dataset. There was a significant negative correlation between lnc-FANCI-2 (LINC00925 being wrongly assigned by NCBI) [8] and MCAM RNA levels (Fig. 7H). The opposite effects of lnc-FANCI-2 and MCAM on cervical cancer survival show that cervical cancer patients with a higher level of lnc-FANCI-2 [8] but a lower level of MCAM in the cervical tissue exhibited a better survival prognosis (Fig. 7I and Fig. S7).

Regulatory roles of lnc-FANCI-2-mediated increase of MCAM and IGFBP3 on RAS signaling

To further dissect the effects of lnc-FANCI-2-associated MCAM and IGFBP3 expression on RAS signaling, we examined the effects of MCAM and IGFBP3 on phosphorylation of PI3K/Akt and Erk1/2 in CaSki cells in the presence or absence of lnc-FANCI-2. As shown in Fig. 7J, although KD or KO of lnc-FANCI-2 promoted phosphorylation of Akt and Erk (Fig. 7J, lane 2), KD of MCAM expression in WT parental CaSki cells in the absence of lnc-FANCI-2 (Fig. 7J, lane 3) or in ΔPr-A9 cells (Fig. 7J, lane 5) reduced Akt and Erk phosphorylation. These data suggest that MCAM is a lnc-FANCI-2 effector but could also be a trigger of signal transduction, as reported [64, 67].

IGFBP3 protein has been viewed as a RAS signaling regulator [30–32], but recent studies show that IGFBP3 has a variety of intracellular ligands involved in many unexpected functions [68, 69]. Thus, the effect of the IGFBP3 expression induced by KD or KO of lnc-FANCI-2 on RAS signaling in WT parental CaSki cells or ΔPr-A9 cells was examined by Western blot after siRNA KD of IGFBP3 expression. As shown in Fig. 7K, KD of IGFBP3 expression increased phosphorylation of Erk1/2 by ∼70% (lane 4) and ∼60% (lane 6), but had little effect on Akt phosphorylation (lanes 4 and 6). Instead, KD of IGFBP3 expression prevented KD of lnc-FANCI-2 from enhancing Akt phosphorylation in WT parental CaSki cells (Fig. 7K, compare lane 2 and lane 3). These data suggest that IGFBP3 acts on Erk1/2 phosphorylation separately from Akt phosphorylation in the presence or absence of lnc-FANCI-2.

MAP4K4 association with lnc-FANCI-2 RNA in CaSki cells regulates RAS signaling and phosphorylation of Akt and Erk

We speculated that lnc-FANCI-2 restricts RAS signaling through interactions with cellular proteins. Therefore, comprehensive identification of lnc-FANCI-2-binding proteins in the WT CaSki cells was performed by an IRPCRP protocol (modified from published ChIRP [70]) in combination with mass spectrometry (Fig. 8A). In this protocol, the lnc-FANCI-2-binding proteins in the WT CaSki cells were covalently crosslinked to lnc-FANCI-2 RNA via UV irradiation and pulled down by 30 pooled antisense oligos spanning the entire lnc-FANCI-2. The RNA extracted from the pulldowns was verified to be lnc-FANCI-2 specific by RT-PCR in the absence (-) or presence (+) of reverse transcriptase (RT) (Fig. 8B). The lnc-FANCI-2-binding proteins in the pulldowns were then subjected to LC-MS/MS analyses. We identified 32 specific lnc-FANCI-2-binding proteins, including H13, HNRH1, K1H1, MAP4K4, and RNPS1 (Fig. 8C, Table S4).

lnc-FANCI-2 interacts with host factors to regulate RAS signaling pathway. (A) The lnc-FANCI-2-associated proteins in the WT CaSki cells were identified by isolation of RNA-protein complexes using RNA purification (IRPCRP)-mass spectrometry technology. (B) The IRPCRP-1 and IRPCRP-2 pulldowns of lnc-FANCI-2 RNA had the pooled antisense biotinylated oligos (pool 1 with oligos in even numbers and pool 2 with oligos in odd numbers) immobilized to avidin-beads first before being mixed with cell lysates, while the IRPCRP-3 and IRPCRP-4 pulldowns had the oligo pools 1 and 2 separately mixed with cell lysates first before addition to the avidin-beads for the RNA pull-downs. RT-PCR in the absence (-) or presence (+) of reverse transcriptase (RT) was carried out using the RNA isolated from the individual IRPCRP experiments using a primer pair of oHBL5 and oHBL12 (Table S5) specific for lnc-FANCI-2 RNA detection. Beads only (no oligos) IRPCRP experiments served as a negative control. Total RNA from the WT CaSki cells after sonication was used as an input control. Arrow indicates the detected lnc-FANCI-2 RNA. (C) Proteins associated with lnc-FANCI-2 RNA identified from lnc-FANCI-2 IRPCRP pulldowns. A total of 32 proteins were specifically pulled down from lnc-FANCI-2 IRPCRP reactions 1-4 (PSM≥2 from two separate pulldowns, Table S4), with the top 10 proteins binding lnc-FANCI-2 shown in order of the number of identified PSM. (D) Expression of p-Akt and p-Erk in CaSki WT cells 48 h after siRNA-mediated KD of MAP4K4 or lnc-FANCI-2 was examined by immunoblot with the corresponding antibodies. GAPDH served as a sample loading control. Fold change of the indicated proteins in the cells with KD of MAP4K4 or lnc-FANCI-2 over the cells treated with a non-targeting siRNA (siNS) was calculated after normalizing to GAPDH. (E) A proposed model illustrates how lnc-FANCI-2-protein complexes inhibit the RAS signaling pathway to control Akt/Erk phosphorylation and expression of host genes. In the absence of lnc-FANCI-2 in CaSki cells, RAS signaling can be triggered by extracellular stimuli, such as IGFBP3 and MCAM ligands. As a result, phosphorylation of AKT and Erk leads to cascaded responses of transcription factors (TFs) to regulate the expression of RAS signaling responder genes, such as IGFBP3 and MCAM.

MAP4K4, a serine/threonine protein kinase, stimulates cancer cell proliferation, invasion and migration. It was recently characterized as a novel MAPK/ERK pathway regulator [71–73] and negatively regulates RAS signaling by binding to Ras p21 protein activator 1 (RASA1) [74, 75]. Thus, we investigated in the WT CaSki cells whether MAP4K4 in association with lnc-FANCI-2 could regulate PI3K/Akt signal transduction. siRNA knockdown of MAP4K4 expression in the WT CaSki cells, like KD of lnc-FANCI-2, led to an increase of ∼40% or more in phosphorylation of both Akt and Erk1/2 (Fig. 8D, compare lane 2 to lane 3). The data suggest that, through RNA-protein interactions, the increased lnc-FANCI-2 RNA in cells and its association with MAP4K4 and other cellular proteins could be one arm controlling RAS signaling and gene expression of its effectors (Fig. 8E).

Discussion

HPV oncoproteins E6 and E7 are necessary but not sufficient for development of cervical cancer. Indeed, early studies indicated that other oncogenic stress, such as an activated RAS mutant, is needed to trigger malignant transformation of E6- and E7-expressing cells [21, 26, 76, 77] and tumorigenesis [27, 28, 78–82]. Moreover, E7 has also been reported to repress phosphorylation of Akt and Akt-mediated signaling [83]. In this study, we show that lnc-FANCI-2, whose expression is highly dependent on E7 and YY1 [8], intrinsically restricts RAS GTPase activities and Akt and Erk phosphorylation in HPV16-positive CaSki cells, presumably by interacting with cellular factors. Interestingly, early reports indicated that p-Akt attenuation by E7 could be abolished by introduction of an H73E mutation in the E7 CR3 domain [83]. This E7 CR3 domain is also essential for interaction with transcription factor YY1 to activate lnc-FANCI-2 transcription [8]. Thus, it is plausible that the reported p-Akt attenuation by E7 is mediated by the increased expression of lnc-FANCI-2. More importantly, these findings also suggest that additional oncogenic stress, such as an activated RAS mutant, is required to overcome the lnc-FANCI-2 restriction on RAS signaling for the previously observed malignant transformation [21, 26, 76, 77] and tumorigenesis [27, 28, 78–82] of E6- and E7-expressing cells. In this report, loss of lnc-FANCI-2 in CaSki cells was found to promote RAS signaling but reduce IFN responses.

Some lncRNAs function as RAS regulators [84, 85] through their ability to sequester RAS-targeting miRNAs, including MALAT1/miR-217, RMRP/miR-206, and KRAS1P/miR-143/let-7 [85]. Conceivably, lncRNAs may also regulate RAS signaling through other mechanisms beyond miRNAs. In this report, loss of lnc-FANCI-2 in HPV16-positive CaSki and SiHa cells promotes RAS signaling and expression of RAS signaling effectors (Fig. 8E). By profiling the proteins associated with lnc-FANCI-2 in CaSki cells, we identified MAP4K4 as one of the major lnc-FANCI-2-binding proteins that may partially mediate the lnc-FANCI-2 restriction on RAS signaling. MAP4K4 is a serine/threonine protein kinase and a negative regulator of RAS signaling in the context of the early embryo [72, 74] and lymphatic vascular development through its interaction with RASA1 [75]. Indeed, silencing of MAP4K4 in CaSki cells led to an increase of both p-Akt and p-Erk1/2, as reported [74, 75]. Thus, our data suggest that MAP4K4 protein functions coordinately with lnc-FANCI-2 RNA in the CaSki cells to restrict RAS signaling.

Although all HPV16-positive cervical cancer and pre-cancer cell lines examined, such as CaSki, SiHa, and W12 subclones 20861 (HPV16+, integrated) and 20863 (HPV16+, episomal), produce predominantly cytoplasmic or nuclear lnc-FANCI-2, we unexpectedly found that not all cervical cancer cell lines express lnc-FANCI-2. HPV18-positive HeLa and C4II cells do not, nor do the HPV-negative colorectal cancer cell line HCT116, adenovirus 5 DNA-transformed human embryonic kidney cell line HEK293, spontaneously transformed aneuploid immortal keratinocyte cell line HaCaT, and KSHV-infected B cell lymphoma cell line BCBL-1. It remains to be determined what negatively regulates lnc-FANCI-2 expression in those cell lines, and more HPV18-positive cell lines should be examined for lnc-FANCI-2 expression. However, we did find that an HPV-negative cervical cancer cell line, C33A, which expresses mutant p53 and mutant pRB [52], produces a high level of lnc-FANCI-2 (Fig. S1A and S1B). In the lnc-FANCI-2-producing cells, we noticed that the cellular location of lnc-FANCI-2 also varies among cell types, with lnc-FANCI-2 predominantly in the nucleus in SiHa cells but in the cytoplasm in CaSki cells, although both contribute to the regulation of RAS signaling. This suggests that a variety of nuclear and cytoplasmic functions of lnc-FANCI-2 remain to be explored.

IGFBP3 is a major IGF carrier and enhances both IGF- and EGF-RAS signaling [31, 86] to activate the phosphatidylinositol 3-kinase (PI3K)/AKT and the mitogen-activated protein kinase (MAPK)/ERK pathways [87]. Surprisingly, we found a suppressive role of IGFBP3 on Erk phosphorylation, but not on Akt phosphorylation, in the presence or absence of lnc-FANCI-2 in HPV16-positive CaSki cells. Although ∼99% of circulating IGF-1 is bound to IGFBPs, predominantly to IGFBP-3, which is the most abundant IGFBP in human serum, case-control studies showed significantly lower IGF-1 and IGFBP-3 serum levels in patients with invasive cervical cancer than in the control group [88, 89]. A reduced IGFBP-3 mRNA level appears to be associated with progression to cervical cancer [63]. Consistent with lnc-FANCI-2’s suppressive effect on IGFBP3 expression in our study, an increased level of lnc-FANCI-2 expression is associated in a stepwise manner with high-risk HPV-positive cervical intraepithelial neoplasia and invasive cervical cancer tissues [8].

It is evident from our RNA-seq and human soluble receptor array analyses that KO of lnc-FANCI-2 in CaSki cells affects the expression of a large set of genes responsible for RAS signaling, the EMT pathway, and IFN responses. In addition to the noted activation of the PI3K/Akt and Raf/Mek/Erk pathways, one of the major RAS signaling effectors in lnc-FANCI-2 KO or KD cells was MCAM. MCAM binds various cell surface receptors or co-receptors to trigger signal transduction, cell proliferation and motility, tumor angiogenesis, and tumor cell metastasis [64, 67]. It is an integral, highly glycosylated membrane protein with three common protein isoforms [64, 90, 91]: two membrane-anchored forms with a long (p113) or short (p76) cytoplasmic tail produced by alternative RNA splicing, and a proteolytically cleaved soluble form without transmembrane and cytoplasmic regions [64, 91, 92]. In addition to its roles in cell signal transduction [64, 67, 90, 91], the short isoform produced by an intramembrane cleavage may be directed towards the nucleus to regulate gene transcription [66]. We found that CaSki cells express mainly a cytoplasmic form of full-length MCAM (∼113 kDa) and a small nuclear form (∼46 kDa), but neither the short ∼78 kDa isoform nor the soluble product. Moreover, lnc-FANCI-2 KO cells exhibit high MCAM levels, most likely as a result of the increased RAS signaling. Consistently, lnc-FANCI-2 and MCAM levels are negatively correlated in cervical cancer patients, and low levels of MCAM and high levels of lnc-FANCI-2 indicate better prognosis. This suggests that increased MCAM might promote cervical tumorigenesis in vivo. Given that the cells with lnc-FANCI-2 KO exhibit increased levels of IGFBP3 and MCAM and the ΔPr-A9 cells with rescued lnc-FANCI-2 RNA expression displayed a remarkable reduction of IGFBP3 and MCAM expression, our current model (Fig. 8E) highlights how the HPV16-enhanced expression of lnc-FANCI-2 in cervical cancer cells might function as a negative regulator by binding to various RNA-binding proteins to block RAS signaling and reduce the expression of RAS signaling effectors such as IGFBP3 and MCAM.

In conclusion, this study provides the first evidence that one function of lnc-FANCI-2 is to maintain epithelial functional integrity by repressing RAS signaling, thereby preventing malignant transformation of neoplastic cervical lesions. In addition, more extensive studies are under way to elucidate the role of lnc-FANCI-2 in regulation of IFN responses, which is not covered in this report. Since transactivation of lnc-FANCI-2 expression is largely dependent on the host transcription factor YY1, whose expression is regulated by HPV oncoproteins and host miR-29a, dissecting their transcription network on lnc-FANCI-2 and its effectors may provide further insights into host homeostasis against HPV infection-induced carcinogenesis. Our observations also highlight additional, unexpected players beyond viral E6 and E7 that drive cells toward malignant transformation.

Materials and methods

Cell cultures and treatment

CaSki, a HPV16-positive (HPV16+) human cervical cancer cell line with wild-type p53, pRb, HRAS, KRAS, and NRAS (https://depmap.org/portal/cell_line/ACH-001336?tab=mutation) was obtained from the American Type Culture Collection (ATCC, Manassas, VA), maintained in Dulbecco’s modified Eagle medium (DMEM) (Thermo Fisher Scientific) supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin, and grown in a cell culture incubator at 37°C with 5% CO2.

SiHa (HPV16+), HeLa (HPV18+), C4II (HPV18+), C33A (no HPV), and other HPV-negative cell lines, HEK293T (Ad5 E1/E2-immortalized human kidney cells), and HaCaT (spontaneously immortalized human epidermal cells), were all obtained from ATCC and grown in Dulbecco’s modified Eagle’s medium (DMEM) with 10% FBS at 37°C and 5% CO2. HCT116 cell line, a colorectal cancer cell line, was a gift from Dr. Bert Vogelstein of Johns Hopkins University and was grown in McCoy’s 5A medium with 10% FBS at 37°C and in 5% CO2. W12 subclones 20861 (HPV16+, integrated) and 20863 (HPV16+, episomal) were a gift from Dr. Paul Lambert of University of Wisconsin, Madison, and grown on mitomycin C-pretreated NIH3T3 feeder cells in F12 medium at 37°C and in 5% CO2. BCBL-1 (body cavity B lymphoma cells with KSHV infection) was obtained from the AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH and cultured in RPMI 1640 containing 10% FBS at 37°C and in 5% CO2.

CaSki cells were treated with 20 µM phosphoinositide 3-kinase (PI3K) inhibitor, LY294002 (Cell Signaling Technology, Danvers, MA, #9901) or 10 µM mitogen-activated protein kinase kinase (MEK) inhibitor, U0126 (Cell Signaling Technology, Danvers, MA, #9903) and examined for their proliferation and viability by CCK-8 assay and RAS effector expression by Western blot.

siRNAs, plasmid constructions and transfections

Synthetic double-stranded siRNA targeting the lnc-FANCI-2 transcript is listed in supplementary Table S5. Human MAP4K4 siRNA (#M-003971), human IGFBP3 siRNA (#M-004777) and non-targeting control siRNA (#D-001210-01) were purchased from Dharmacon, Inc (Lafayette, CO). Each siRNA was transfected into CaSki or other types of cells using LipoJet in vitro transfection reagent (SignaGen, Frederick, MD, USA). Plasmid pHBL03 contains a short isoform of lnc-FANCI-2 cDNA (lnc-FANCI-2a-PA1, GenBank Acc. No. MT669801). Overlapping PCR products generated from 5’ and 3’ RACE PCR products with a primer pair of oHBL 17 and oHBL 18 were digested and inserted into pcDNA3 at the Hind III and Xba I sites. The insertion was verified by sequencing. Plasmid pHBL05 contains a long isoform of lnc-FANCI-2 (lnc-FANCI-2a-PA2, GenBank Acc. No. MT669800). PCR products generated from genomic DNA with oHBL 19 and oHBL 23 were digested and swapped into pHBL03 to replace lnc-FANCI-2a-PA1 at the BsmB I and Xba I sites.

The CRISPR/Cas9 system was used to knock out the lnc-FANCI-2 gene in CaSki cells. Two lnc-FANCI-2-specific guide RNAs (gRNAs) were designed using an online CRISPR design tool developed by the Zhang lab (http://crispr.mit.edu). A modified cloning strategy, described in our previous publication [53], was used to clone the two gRNAs into the pSpCas9(BB)-2A-Puro vector (Addgene, Watertown, MA, #62988). This strategy allows all three components (two gRNAs and Cas9) to be expressed from the same plasmid, ensuring that the transfected cells express both gRNAs simultaneously to knock out the targeted gene [53]. The sequences of the primers used to make the guide RNAs are listed in supplementary Table S5.

Knockout of lnc-FANCI-2 gene in CaSki cells, PCR screening and genotyping

Two gRNAs, each under the control of a U6 promoter in a pSpCas9(BB)-2A-Puro vector, were used to create deletions in CaSki cells. Plasmid pHBL25, which contains gRNA 1 and 6, was used to delete the entire promoter region of lnc-FANCI-2, and plasmid pHBL27, which contains gRNA 3 and 4, was used to delete only two YY1-binding motifs spanning an 87-bp region in the lnc-FANCI-2 promoter. CaSki cells transfected with the gRNA-expressing vectors were then placed under puromycin selection (0.1 µg/ml) for 2 weeks. The puromycin-resistant CaSki cells were diluted to a final concentration of 0.5 cells per 100 μl, and 100 μl of the diluted cells was plated into each well of a 96-well plate and grown under continuous puromycin selection. The colonies were inspected to ensure that each clone grew from a single cell, and were refed or re-plated as needed during 3 weeks of expansion.
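The limiting-dilution step above (0.5 cells per 100 μl, 100 μl per well) can be illustrated with a short calculation. The sketch below assumes that the number of cells landing in a well follows a Poisson distribution, which is a common modeling assumption and is not stated in the protocol; the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, mean_cells: float) -> float:
    """Probability of exactly k cells in a well for a given mean cells/well."""
    return math.exp(-mean_cells) * mean_cells ** k / math.factorial(k)


def seeding_summary(mean_cells: float = 0.5, wells: int = 96) -> dict:
    """Summarize expected well occupancy under the Poisson assumption."""
    p_empty = poisson_pmf(0, mean_cells)
    p_single = poisson_pmf(1, mean_cells)
    p_multi = 1.0 - p_empty - p_single
    return {
        "p_empty": p_empty,
        "p_single": p_single,
        "p_multi": p_multi,
        # Fraction of occupied wells that started from exactly one cell.
        "clonal_fraction_of_occupied": p_single / (1.0 - p_empty),
        "expected_single_cell_wells": wells * p_single,
    }


if __name__ == "__main__":
    for name, value in seeding_summary().items():
        print(f"{name}: {value:.3f}")
```

Under this assumption, roughly 30% of wells would receive exactly one cell and about 9% would receive two or more, which is consistent with the need to inspect colonies visually to confirm single-cell origin.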

A direct PCR was performed on a number of single-cell clones as described [53]. Briefly, several hundred cells expanded from a single-cell clone were resuspended in 50 μl of phosphate-buffered saline (PBS) and then frozen on dry ice and thawed three times. Cell lysates were treated with 4 μl of Qiagen protease (QIAGEN, Hilden, Germany, #1017782) at 56°C for 10 min, and the protease was inactivated at 95°C for 5 min. The resulting product was directly used for PCR screening. Primers P1 + P5 were used for verification of the promoter deletion and P3 + P4 for the YY1 motif deletion. To ensure that the selected single-cell clones carried a homozygous knockout of lnc-FANCI-2, total cell DNA was isolated using a QIAamp DNA Blood Minikit (QIAGEN, Hilden, Germany, #51106), and the clones with homozygous KO were selected and confirmed by PCR. The primers used for genotyping are listed in supplementary Table S5. At least two single-cell clones with either the lnc-FANCI-2 promoter deletion or the YY1 motif deletion were selected. Three single-cell clones from the cells transfected with an empty vector, pSpCas9(BB)-2A-Puro, were also selected and served as controls for the genotyping screening.

Cell proliferation and viability assay

Cell proliferation and viability were determined by a CCK-8 assay (Dojindo Molecular Technologies, Rockville, MD). For the proliferation assay, 500 µl of cell suspension (50,000 cells/well) was dispensed into individual wells of a 24-well plate. After 24 h of culture, the cells were treated with inhibitors (20 μM LY294002 or 10 μM U0126). At each time point of 24 h, 48 h and 72 h, 50 µl of CCK-8 solution was added to three wells of each group and incubated for 1 h in a cell culture incubator at 37°C. Cell proliferation was determined by absorbance at 450 nm, with normal medium used for background subtraction. The percentage of cell viability was calculated by the following formula: cell viability % = OD450 of inhibitor-treated sample / OD450 of untreated sample × 100%.
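The viability formula above can be written as a small helper. This is a minimal sketch: the function name, the averaging of the three replicate wells before taking the ratio, and the example absorbance readings are illustrative assumptions rather than details taken from the protocol.

```python
from statistics import mean
from typing import Sequence


def percent_viability(
    treated_od450: Sequence[float],
    untreated_od450: Sequence[float],
    medium_blank_od450: float = 0.0,
) -> float:
    """Cell viability % = OD450(treated) / OD450(untreated) x 100.

    Replicate wells are averaged, and the medium-only blank is subtracted
    from both groups before the ratio is taken.
    """
    treated = mean(treated_od450) - medium_blank_od450
    untreated = mean(untreated_od450) - medium_blank_od450
    if untreated <= 0:
        raise ValueError("untreated signal must exceed the medium blank")
    return treated / untreated * 100.0


if __name__ == "__main__":
    # Hypothetical triplicate readings, for illustration only.
    print(percent_viability([0.62, 0.58, 0.60], [1.12, 1.08, 1.10], 0.10))
```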

Nuclear and cytoplasmic fractionation

CaSki and SiHa cells were fractionated using the Nuclei EZ Prep Kit (Sigma-Aldrich, #NUC-101) following the manufacturer’s protocol. The details of the method can be found in our previous publication [93]. Western blot analysis for nuclear protein SRSF3 and cytoplasmic protein GAPDH was used to determine the fractionation efficiency. The fractionated cytoplasmic and nuclear RNAs were used for detection of lnc-FANCI-2 RNA, with human GAPDH RNA serving as a control for RNA fractionation efficiency of the cytoplasmic RNAs.

RT-qPCR

Detection of lnc-FANCI-2 by RT-qPCR was performed as described [8]. Pre-designed primers for lnc-FANCI-2 are listed in Table S5. Briefly, 2 µg of total RNA was converted to cDNA using the SuperScript First-Strand Synthesis kit (Thermo Fisher Scientific, Waltham, MA, #11904018). qPCR was performed using TaqMan Gene Expression Master Mix (Applied Biosystems, Waltham, MA, #4369016) on a StepOne Plus Real-Time PCR system (Applied Biosystems). A TaqMan Gene Expression Assay (Applied Biosystems, #4331182) for GAPDH (Hs02758991_g1) served as an internal control. Data were plotted as fold change over the control group using the 2^(-ΔΔCt) method, in which the data were normalized first to the values for GAPDH and then to the median value for control samples. Data are presented as a bar graph with mean ± SD for each group.
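The 2^(-ΔΔCt) calculation described above can be sketched as follows. The function name, the tuple layout of the inputs, and the Ct values in the usage example are hypothetical; the only steps taken from the text are normalization to GAPDH followed by normalization to the median of the control samples.

```python
from statistics import median
from typing import List, Sequence, Tuple

# Each measurement is (Ct of target, Ct of GAPDH) from the same sample.
CtPair = Tuple[float, float]


def fold_changes(samples: Sequence[CtPair], controls: Sequence[CtPair]) -> List[float]:
    """Relative expression by 2^(-ddCt).

    dCt  = Ct(target) - Ct(GAPDH)            (normalize to the reference gene)
    ddCt = dCt(sample) - median dCt(control) (normalize to the control group)
    """
    control_dct = median([target - ref for target, ref in controls])
    return [2.0 ** -((target - ref) - control_dct) for target, ref in samples]


if __name__ == "__main__":
    # Illustrative Ct values only, not data from this study.
    controls = [(25.0, 18.0), (25.2, 18.0), (24.8, 18.0)]
    knockdown = [(27.0, 18.0), (27.5, 18.1)]
    print(fold_changes(knockdown, controls))
```

A sample whose ΔCt is two cycles higher than the control median therefore reports a fold change of 0.25.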

Northern blot

To validate lnc-FANCI-2 KO in CaSki cells, Northern blot was performed using polyA+ mRNA isolated from 100 µg of total CaSki RNA with PolyATtract mRNA Isolation Systems (Promega, #Z5310) or 10 µg of total CaSki RNA as described previously [8]. Briefly, RNA samples were denatured in NorthernMax Formaldehyde loading dye (Thermo Fisher Scientific, #AM8552) at 75°C for 15 min, then separated in a 1% formaldehyde-containing agarose gel and transferred onto a GeneScreen Plus hybridization transfer membrane (Perkin Elmer, Waltham, MA, #NEF987001PK). RNAs on the membrane were crosslinked by exposure to UV light, and then prehybridized with PerfectHyb Plus hybridization buffer (Sigma-Aldrich, St. Louis, MO, #H7003) for 2 h at 42°C. Specific oligos against lnc-FANCI-2 or GAPDH listed in Table S5 were labeled with [γ-32P]-ATP using T4 PNK (Thermo Fisher Scientific, Waltham, MA, #18004–010) and added into the hybridization buffer for overnight incubation at 42°C. The membrane was then washed with 2× SSPE/0.5% SDS solution for 5 min, followed by two washes with 0.5× SSPE/0.5% SDS for 15 min each at 42°C, and then exposed to a PhosphorImager screen.

RNA-seq and data analysis

Total RNA from WT CaSki, ΔYY1-D5 and ΔPr-A9 cells was extracted using TRIzol reagent (Thermo Fisher Scientific, Waltham, MA, #15596018). Total ribo-minus RNA-seq libraries, four samples in each group, were prepared with the TruSeq Stranded Total RNA Library Kit and then subjected to paired-end sequencing on the Illumina HiSeq 3000/4000 platform.

The obtained reads were processed using the CCBR Pipeliner utility (https://github.com/CCBR/Pipeliner). Briefly, adapters and low-quality bases were trimmed from the reads using Cutadapt (version 1.18) (https://bioweb.pasteur.fr/packages/pack@cutadapt@1.18) before alignment to the custom reference genome described below. The trimmed reads were aligned using STAR v2.5.2b in two-pass mode [94]. Expression levels were quantified using RSEM (version 1.3.0) [95] with a custom gene annotation described below.

The custom reference genome allowing quantification of both HPV16 and host expression used in this alignment consisted of the human reference genome (hg38/Dec. 2013/GRCh38) with a HPV16 FASTA (https://pave.niaid.nih.gov/) sequence added as an additional pseudochromosome. The custom gene annotation used for gene expression quantification consisted of a concatenation of the hg38 GENCODE annotation version 30 [96] and the HPV16 genome, with one other notable alteration. The hg38 v30 gene annotation is incorrect at the location of the lnc-FANCI-2 locus in the hg38 genome at chr15:89378104-89398487, which affects quantification of this gene. To correct this annotation and provide accurate quantification, therefore, we first performed BLAST against hg38 using the sequences of the 14 known isoform transcripts of lnc-FANCI-2 (MT669800 for lnc-FANCI-2a-PA2, MT669801 for lnc-FANCI-2a-PA1, MT669802 for lnc-FANCI-2b-PA2, MT669803 for lnc-FANCI-2b-PA1, MT669804 for lnc-FANCI-2c-PA2, MT669805 for lnc-FANCI-2c-PA1, MT669806 for lnc-FANCI-2d-PA2, MT669807 for lnc-FANCI-2d-PA1, MT669808 for lnc-FANCI-2e-PA2, MT669809 for lnc-FANCI-2e-PA1, MT669810 for lnc-FANCI-2f-PA2, MT669811 for lnc-FANCI-2f-PA1, MT669812 for lnc-FANCI-2g-PA2, and MT669813 for lnc-FANCI-2g-PA1). The results were used to determine the precise exon start and stop sites for each of the isoforms. These were then used to manually adjust the default hg38 v30 gene annotations at the locus indicated above to account for the correct structure of the lnc-FANCI-2 locus and its 14 isoforms.
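Adding the viral genome as an extra pseudochromosome amounts to appending one FASTA record to the host reference under a unique sequence name. The sketch below illustrates the idea on lists of FASTA lines; the function name and the record name "HPV16" are assumptions for illustration, and this is not the code used by the pipeline.

```python
from typing import Iterable, List


def append_pseudochromosome(
    host_lines: Iterable[str],
    viral_lines: Iterable[str],
    viral_name: str = "HPV16",
) -> List[str]:
    """Return host FASTA lines followed by one viral record renamed viral_name."""
    merged = [line.rstrip("\n") for line in host_lines if line.strip()]
    # Sequence names are the first whitespace-delimited token of each header.
    host_names = {l[1:].split()[0] for l in merged if l.startswith(">")}
    if viral_name in host_names:
        raise ValueError(f"{viral_name!r} already exists in the host reference")

    headers_seen = 0
    for line in viral_lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith(">"):
            headers_seen += 1
            if headers_seen > 1:
                raise ValueError("expected a single viral FASTA record")
            merged.append(f">{viral_name}")  # rename to a simple contig name
        else:
            merged.append(line)
    if headers_seen == 0:
        raise ValueError("viral input has no FASTA header")
    return merged


if __name__ == "__main__":
    host = [">chr1 toy\n", "ACGTACGT\n"]
    virus = [">viral_record toy\n", "TTAACCGG\n"]
    print("\n".join(append_pseudochromosome(host, virus)))
```

For a real genome the same logic would be applied while streaming between files rather than holding lines in memory, and the gene annotation would need a matching entry for the added sequence name so that viral reads can be quantified alongside host genes.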

Downstream analysis and visualization were performed within the NIH Integrated Analysis Portal (NIDAP) using R programs developed on the Foundry platform (Palantir Technologies). Briefly, raw counts data produced by RSEM were imported into the NIDAP platform, genes were filtered for low counts (<1 cpm) and the voom algorithm [97] from the Limma R package (version 3.40.6) [98] was used for quantile normalization and calculation of differentially expressed genes. Pre-ranked GSEA was performed using the Molecular Signatures Database version 6.2 [99, 100] and the fgsea package [101]. Raw data and the analyzed RNA-seq data supporting the findings in this study have been deposited in the NCBI GEO database (the accession#: GSE190904).
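The low-count filter mentioned above (<1 cpm) can be illustrated in a few lines. This sketch is written in plain Python for clarity, whereas the analysis itself was run in R; the rule of keeping a gene when it reaches the threshold in at least `min_samples` samples is an assumption, since the text gives only the cpm cutoff.

```python
from typing import Dict, List


def counts_per_million(counts: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Convert a gene -> per-sample raw count table to counts per million."""
    n_samples = len(next(iter(counts.values())))
    library_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    if any(size == 0 for size in library_sizes):
        raise ValueError("every sample needs at least one counted read")
    return {
        gene: [row[j] * 1e6 / library_sizes[j] for j in range(n_samples)]
        for gene, row in counts.items()
    }


def filter_low_counts(
    counts: Dict[str, List[float]],
    min_cpm: float = 1.0,
    min_samples: int = 1,  # assumed rule; not specified in the text
) -> Dict[str, List[float]]:
    """Keep genes with cpm >= min_cpm in at least min_samples samples."""
    cpm = counts_per_million(counts)
    return {
        gene: counts[gene]
        for gene, values in cpm.items()
        if sum(value >= min_cpm for value in values) >= min_samples
    }


if __name__ == "__main__":
    toy = {"geneA": [1_999_989, 1_999_989], "geneB": [10, 10], "geneC": [1, 1], "geneD": [0, 0]}
    print(sorted(filter_low_counts(toy)))
```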

Human Soluble Receptor Array

To detect changes in soluble receptors expressed and released from parental WT CaSki cells and lnc-FANCI-2 KO cells, the Proteome Profiler Human sReceptor (Soluble Receptor) Array (R&D Systems, Minneapolis, MN, #ARY012) was used according to the manufacturer’s protocol. Briefly, 7 × 10^6 WT CaSki cells and ΔPr-A9 cells were cultured in 15 ml of DMEM supplemented with 10% FBS in a T75 flask for 24 h. Cell culture supernatants were then collected and centrifuged at 4,000 × g for 15 min to remove cell debris. Cells were rinsed with 10 ml of PBS and solubilized in Lysis Buffer 17 at 4°C for 30 min. The cell lysate was then centrifuged at 14,000 × g for 5 min to remove cell debris. Protein concentration of the cell lysate was measured by Micro BCA Protein Assay (PIERCE, Waltham, MA, #23235). Next, two sets of N and C membranes were blocked with Buffer 8/1 in a 4-Well Multi-dish for 1 h on a rocking platform. Then 500 μl of culture supernatant or 100 μg of cell lysate generated from the WT CaSki cells or ΔPr-A9 cells was diluted in 3 ml of Buffer 8/1 and applied to each membrane after three washes with 1× Wash Buffer. The membranes were incubated overnight at 4°C on a rocking platform and washed three times to remove unbound materials, followed by incubation with their specific cocktail of biotinylated detection antibodies for 2 h at room temperature on a rocking platform. After three washes, 2 ml of diluted Streptavidin-HRP was applied to each membrane for 30 min at room temperature on a rocking platform. After three more washes, 1 ml of the prepared Chemi Reagent Mix was added onto the membranes. The signal was then captured by a ChemiDoc Touch Imaging System (Bio-Rad). Relative protein expression in ΔPr-A9 cells compared to the WT cells was quantified in Image Lab software (Bio-Rad). The average signal of duplicate spots, after subtraction of the background value from negative control spots, represented the level of each protein. The relative change in protein levels was then determined by comparing ΔPr-A9 cells to the WT cells.
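The array quantification described above (duplicate-spot average, background subtraction, then the KO-to-WT comparison) can be sketched as follows. The function names and the signal intensities in the usage example are hypothetical, and expressing the comparison as a simple KO/WT ratio is an assumption about how the relative change was reported.

```python
from statistics import mean
from typing import Sequence


def spot_level(duplicate_spots: Sequence[float], negative_control_spots: Sequence[float]) -> float:
    """Average of the duplicate spots minus the mean negative-control background."""
    return mean(duplicate_spots) - mean(negative_control_spots)


def relative_change(
    ko_spots: Sequence[float],
    wt_spots: Sequence[float],
    ko_background: Sequence[float],
    wt_background: Sequence[float],
) -> float:
    """Background-corrected KO level divided by background-corrected WT level."""
    wt_level = spot_level(wt_spots, wt_background)
    ko_level = spot_level(ko_spots, ko_background)
    if wt_level <= 0:
        raise ValueError("WT signal is not above background; ratio undefined")
    return ko_level / wt_level


if __name__ == "__main__":
    # Illustrative intensities only, not data from this study.
    print(relative_change([1100, 1300], [700, 700], [200, 200], [200, 200]))
```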

Antibodies and Immunoblot

Immunoblot analysis of individual proteins was performed using the following antibodies. Anti-MCAM (#17564-1-AP), anti-E2F1 (#12171-1-AP), anti-MAP4K4 (#55247-1-AP), and anti-p53 (#10442-1-AP) antibodies were from ProteinTech (Rosemont, IL). Anti-tubulin (#T5201) antibodies were from Sigma-Aldrich (St. Louis, MO). Anti-hnRNP C1/C2 (#Ab10294) and anti-RAC3 (#ab129062) antibodies were from Abcam (Cambridge, United Kingdom). Anti-CCND2 (Cyclin D2, #3741), anti-Akt (#9272), anti-phospho-Akt (#9271), anti-Erk1/2 (p44/42 MAPK, #4695), anti-phospho-Erk1/2 (#4370), and anti-IGFBP3 (D1U9C, #25864) were from Cell Signaling Technology (Danvers, MA). Anti-VIM (#MA5-11883) was from Thermo Fisher Scientific (Waltham, MA). Anti-PODXL2 (#AF1524), anti-ECM1 (#MAB39371), anti-TIMP2 (#AF971) and anti-ADAM8 (#AF1031) were from R&D Systems (Minneapolis, MN). Anti-SRSF3 (#NBP2-76892) was from Novus Biologicals (Centennial, CO). Rabbit polyclonal anti-HPV16 E7 (#GTX133411) was from GeneTex (Irvine, CA).

RAS GTPase Chemi ELISA

RAS GTPase activity in cell lysates of the WT CaSki or ΔPr-A9 cells was determined with the RAS GTPase Chemi ELISA Kit (Active Motif, Carlsbad, CA, #52097) according to the manufacturer’s protocol. Briefly, 2 × 10^7 cells were solubilized in 500 µl of Complete Lysis/Binding buffer at 4°C for 15 min. After cell debris was removed by centrifugation at 14,000 × g for 10 min, the protein concentration of the cell extract was measured by Micro BCA Protein Assay (PIERCE, #23235). Then 2 µg of GST-Raf-RBD in 50 µl of Complete Lysis/Binding buffer was coated in each well of a 96-well plate by incubating for 1 h at 4°C with 100 rpm agitation. After three washes, 50 µg of extract in 50 µl of Complete Lysis/Binding buffer was added to each well. HeLa (EGF-treated) extract served as a positive control. The plate was incubated for 1 h at room temperature with 100 rpm agitation. The active RAS protein (bound to GST-Raf-RBD) in each well was then detected by incubation with a primary RAS antibody followed by three washes, and then with an HRP-conjugated secondary antibody followed by three washes. Chemiluminescence in each well was read in a luminometer after adding 50 µl of room-temperature Chemiluminescent Working Solution.

RNA in situ hybridization (RNA-ISH)

Endogenous lnc-FANCI-2 in CaSki cells was examined by RNAscope Multiplex Fluorescent V2 Assay (Advanced Cell Diagnostics, Minneapolis, MN, #323100) as described previously [8]. A custom-designed probe targeting nt 359-1713 of the lnc-FANCI-2a-PA2 transcript (GenBank Acc. No. MT669800.1) was used. Dual staining of lnc-FANCI-2 and IGFBP3 RNA was performed according to the RNAscope Multiplex Fluorescent v2 Manual part 2 (#323100), using the channel one probe for IGFBP3 (#310351) and the channel three probe for lnc-FANCI-2 (#509061-C3).

For RNA-ISH combined with IF staining of MCAM protein, 3 × 10^5 CaSki cells were grown on a glass coverslip in a 6-well plate for 24 h before transfection. Plasmid pHBL05 containing the long isoform lnc-FANCI-2a-PA2 (GenBank Acc. No. MT669800.1) was transfected into the cells with LipoD 293 DNA in vitro transfection reagent. Cultured Adherent Cell Sample Preparation for the RNAscope Multiplex Fluorescent v2 assay was performed according to the manufacturer’s protocol (Advanced Cell Diagnostics, Minneapolis, MN, MK-50-010). Briefly, the cells were washed with PBS and fixed with 10% neutral buffered formalin at room temperature for 30 min. The cells were washed with PBS three times and dehydrated in 50% ethanol for 5 min, 70% ethanol for 5 min, 100% ethanol for 5 min, and 100% ethanol for 10 min. The cells were then rehydrated in 70% ethanol for 2 min, 50% ethanol for 2 min, and PBS for 10 min. The cells were then treated with hydrogen peroxide at RT for 10 min and washed with distilled water three times, followed by treatment with Protease III (1:15 dilution in PBS) at RT for 10 min. The cells were then ready for the procedures in the RNAscope Multiplex Fluorescent v2 Manual part 2 (#323100). IF was performed after RNA-ISH but before counterstaining with DAPI. The cells were washed with PBS and then blocked with 2% bovine serum albumin (BSA) in PBS for 1 h at 37°C or overnight at 4°C. The cells were incubated with the MCAM primary antibody (ProteinTech, Rosemont, IL, #17564-1-AP, diluted 1:200 in 2% BSA blocking buffer) for 2 h at 37°C. An Alexa Fluor-conjugated secondary antibody (1:500, Thermo Fisher Scientific) was diluted in the blocking solution and incubated with the cells for 1 h at 37°C. The slides were washed with PBS three times. DAPI was used for nuclear counterstaining before mounting in ProLong Gold Antifade mounting medium (Thermo Fisher Scientific, #P36934).

Confocal images were collected using a Zeiss LSM710 laser-scanning microscope equipped with a 63× Plan-Apochromat (N.A. 1.4) objective lens. Three-dimensional distributions of lnc-FANCI-2 were generated from Z stacks using ZEN 2.3 software (Zeiss).

Isolation of lnc-FANCI-2 RNA-protein complex by RNA Purification (IRPCRP) and LC-MS/MS analysis

To identify lnc-FANCI-2 RNA-associated proteins, we pulled down lnc-FANCI-2 RNA from WT CaSki cells using IRPCRP, a protocol modified from the published ChIRP method [102], followed by mass spectrometry [70]. Briefly, a total of 30 lnc-FANCI-2 antisense oligo probes (oLLY496-oLLY525), each with biotinTEG at the 3’ end and together spanning the entire lnc-FANCI-2 RNA (GenBank Acc. No. MT669800.1), were designed using the online probe designer (singlemoleculefish.com). Two probe pools were used in IRPCRP assays: probe pool 1 contained the even-numbered oligos (oLLY496/498/500/502/504/506/508/510/512/514/516/518/520/522/524) and probe pool 2 contained the odd-numbered oligos (oLLY497/499/501/503/505/507/509/511/513/515/517/519/521/523/525) (Table S5). CaSki cells were seeded and harvested at 24 h, and the cell lysates from 200 million cells were used for each IRPCRP reaction. Instead of the formaldehyde cross-linking used in the standard protocol [102], UV cross-linking (254 nm, 480 mJ/cm²) was applied in our study to minimize non-specific binding. After UV cross-linking, the cells were lysed on ice for 30 min in 1× radioimmunoprecipitation assay (RIPA) buffer (Boston Bioproducts, Ashland, MA, #BP-115) (50 mM Tris base/Tris-HCl [pH 7.4], 150 mM NaCl, 0.5% sodium deoxycholate, 0.1% sodium dodecyl sulfate [SDS], 1% NP-40) supplemented with protease inhibitors (complete mini-EDTA-free protease inhibitor cocktail, Millipore Sigma, Burlington, MA, #469315900) and RNase inhibitor (Thermo Fisher Scientific, #AM2694), followed by brief sonication. The cell lysates were centrifuged at 20,000 × g for 10 min at 4°C, and the collected supernatants were pre-absorbed with C-1 magnetic beads (Thermo Fisher Scientific, #65002) for 1 h at room temperature before proceeding to oligo probe hybridization.
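As an illustration only, the even/odd assignment of the 30 probes to the two pools can be expressed in a few lines of Python. The function name and its parameters are hypothetical; the probe sequences themselves are listed in Table S5 and are not reproduced here.

```python
# Hypothetical helper: split probe IDs oLLY496..oLLY525 into the two pools
# described in the text (pool 1 = even-numbered, pool 2 = odd-numbered).
def split_probe_pools(first=496, last=525, prefix="oLLY"):
    ids = range(first, last + 1)
    pool1 = [f"{prefix}{n}" for n in ids if n % 2 == 0]  # even-numbered oligos
    pool2 = [f"{prefix}{n}" for n in ids if n % 2 == 1]  # odd-numbered oligos
    return pool1, pool2


pool1, pool2 = split_probe_pools()
print(len(pool1), pool1[0], pool1[-1])  # size and ends of pool 1
print(len(pool2), pool2[0], pool2[-1])  # size and ends of pool 2
```

Each pool holds 15 probes that alternate along the transcript, so the two pools tile the RNA independently of each other.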

Two different hybridization protocols were used to pull down lnc-FANCI-2 RNA. In the first, the pre-absorbed cell lysate was incubated with the two separate probe pools in hybridization buffer (750 mM NaCl, 0.1% SDS, 50 mM Tris-Cl pH 7.0, 1 mM EDTA, 15% formamide) at room temperature overnight, and pre-washed C-1 magnetic beads were then added and incubated for an additional 4 h. In the second (modified) protocol, the two separate oligo probe pools were first immobilized onto pre-washed C-1 magnetic beads at room temperature for 1 h and then mixed with the pre-absorbed cell lysates in hybridization buffer overnight at room temperature; a beads-only IRPCRP reaction without oligo probes served as the negative control in the modified protocol. After hybridization, the beads were washed five times with 1× wash buffer (2× NaCl and sodium citrate [SSC], 0.1% SDS) and resuspended in 1 ml lysis buffer. From each IRPCRP, 100 µl of beads was set aside for RNA isolation to assess lnc-FANCI-2 pulldown efficiency, and the remaining beads were sent for mass spectrometry analysis. The input and IRPCRP RNA were isolated using a miRNeasy mini kit (Qiagen, #217004) following the standard protocol after proteinase K (Millipore Sigma, #71049) digestion.

For mass spectrometry analysis, the IRPCRP beads were processed for trypsin/LysC digestion before being submitted to a Thermo Scientific Orbitrap Exploris 240 Mass Spectrometer and a Thermo Dionex UltiMate 3000 RSLCnano System for proteomics. Peptides from trypsin digestion were loaded onto a peptide trap cartridge at a flow rate of 5 µl/min. The trapped peptides were eluted onto a reversed-phase Easy-Spray Column PepMap RSLC, C18, 2 µm, 100 Å, 75 µm × 250 mm (Thermo Scientific) using a linear gradient of acetonitrile (3-36%) in 0.1% formic acid. The elution duration was 110 min at a flow rate of 0.3 µl/min. Eluted peptides from the Easy-Spray column were ionized and sprayed into the mass spectrometer using a Nano Easy-Spray Ion Source (Thermo Fisher Scientific) under the following settings: spray voltage, 1.6 kV; capillary temperature, 275 °C. Other settings were empirically determined. Raw data files were searched against the human protein sequence database using Proteome Discoverer 2.4 software (Thermo Fisher Scientific, San Jose, CA) based on the SEQUEST algorithm. All peptides identified in the IRPCRP pulldowns are summarized in Table S4.

Acknowledgements

We thank Louise T. Chow of the University of Alabama at Birmingham for her critical reading of the manuscript. We thank Craig Meyers of Penn State University Hershey Medical Center for providing HPV16- and HPV18-infected raft cultures, Xing Xie and Yang Li of Zhejiang University Women’s Hospital of China for cervical tissue sections, and Johannes G. Schweizer of Arbor Vita Corporation for anti-HPV16 E6-specific antibodies. This study was supported by the Intramural Research Program of the NIH, National Cancer Institute, Center for Cancer Research.

Data availability

NCBI GEO database Acc. No. GSE190904 for CaSki cell RNA-seq.

Additional information

Author contributions

H. L., L. Y., and Z.M.Z. designed the study. H. L. and L. Y. performed the experiments. H. L., L. Y., V.M., T. M., M. Y., P. J., M.C., D. R. L., and Z.M.Z. analyzed data and participated in the discussion and interpretation of the results. H. L., L. Y., and Z.M.Z. drafted the manuscript. H. L., L. Y., and Z.M.Z. revised the manuscript. All authors read and approved the final manuscript.

Declaration of interests

The authors declare no competing interests.

Supporting Figures

Selective expression of lnc-FANCI-2 RNA in a subset of cervical cancer cell lines. (A) RT-qPCR detection of lnc-FANCI-2 expression in HPV-positive cervical cancer cell lines: SiHa (HPV16+), CaSki (HPV16+), HeLa (HPV18+), C4II (HPV18+), and W12 subclones 20861 (HPV16+, integrated) and 20863 (HPV16+, episomal), and in HPV-negative cell lines: C33A (cervical cancer cells with mutations of p53 and pRb), HCT116 (colorectal cancer cells), BCBL-1 (body cavity B lymphoma cells), HEK293 (Ad5 E1/E2-immortalized human kidney cells), and HaCaT (spontaneously immortalized human epidermal cells). (B) HeLa cells express no lnc-FANCI-2 when compared with C33A cells as determined by Northern blot. (C) Subcellular fractionation shows that lnc-FANCI-2 is predominantly cytoplasmic in CaSki cells but nuclear in SiHa cells. Cytoplasmic and nuclear fractionation efficiency was determined by Western blot analyses of the nuclear protein SRSF3 (serine- and arginine-rich splicing factor 3) and the cytoplasmic protein GAPDH. Total fractionated cytoplasmic and nuclear RNAs from CaSki and SiHa cells were quantified for lnc-FANCI-2 by RT-qPCR, with cytoplasmic GAPDH RNA serving as an internal control for RNA fractionation efficiency. (D) Subcellular localization of lnc-FANCI-2 (red) in CaSki and SiHa cells was examined by RNAscope RNA ISH analysis. Nuclei were stained with DAPI (blue). Scale bars: 25 μm in the left panels and 10 μm in the zoomed panels.

Characterization of lnc-FANCI-2 KO on CaSki cell growth and viral oncoprotein expression. (A) The characteristic cell growth behavior and morphology of the wild type (WT) and single cell clones with deletion of ΔYY1 motifs (ΔYY1, clone # D5) and ΔPromoter motif (ΔPr, clone # A9) were imaged at 24 h after spreading. (B) Effect of lnc-FANCI-2 KO in CaSki cells on the expression of HPV16 E6 and E7 and their downstream targets. Total cell extracts from parental CaSki cells and ΔPr-A9 cells were examined by immunoblotting with corresponding antibodies. Tubulin served as a protein loading control. The relative protein levels of E6, E7, p53, and E2F1 were calculated after normalizing to tubulin.

GSEA enrichment plots show the enrichment score and gene hits enriched in ΔYY1-D5 cells using Hallmark gene sets: (A) KRAS_SIGNALING_UP. (B) EPITHELIAL_MESENCHYMAL_TRANSITION. (C) INTERFERON_ALPHA_RESPONSE. (D) INTERFERON_GAMMA_RESPONSE. The heatmap on the right side of the enrichment plot in each panel visualizes the differentially expressed genes enriched in each pathway in ΔYY1-D5 cells.

Pathway map of the RAS Initiative with highlighted differentially expressed genes (DEGs) in ΔPr-A9 cells vs parental wild-type cells. DEGs are mainly distributed and clustered in two main branches of the RAS pathway: the MAPK signaling branch and the PI3K/AKT branch. DEGs were derived from RNA-seq data at a cutoff of adjusted p-value ≤ 0.01 and fold change > 1.5 for both up- and down-regulated genes (indicated by red and blue arrows, respectively) between A9 cells and WT CaSki cells using three common analysis methods (DESeq2, edgeR, and limma-voom) and the in-house BRB analysis pipeline from NCI. Highlighted genes in the pathway map were identified as DEGs by at least three of the four analysis methods, although in most cases these DEGs behaved consistently across all four methods. The MAPK signaling branch is highlighted in a transparent green box and the PI3K/AKT branch in a transparent yellow box within the RAS signaling pathway map, which was collated from community inputs organized and stimulated by the RAS Initiative and is maintained as a common knowledge base at https://www.cancer.gov/research/key-initiatives/ras/ras-central/blog/2015/ras-pathway-v2; it was also described in a recent review (Figure 1 in Nissley and McCormick, 2022).

Reference (where the Ras pathway map has been described):

Nissley, D. V. and McCormick, F. (2022). RAS at 40: update from the RAS Initiative. Cancer Discov. https://doi.org/10.1158/2159-8290.CD-21-1554
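The DEG selection rule stated in the pathway-map legend (adjusted p-value ≤ 0.01, fold change > 1.5 in either direction, agreement in at least three of the four methods) can be sketched as below. This is a minimal illustration and not the authors' pipeline: the function names, the input layout, the gene labels, and all numbers are made up, and fold change is assumed to be a KO/WT ratio so that down-regulation corresponds to a ratio below 1/1.5.

```python
from collections import Counter

P_ADJ_MAX = 0.01   # adjusted p-value cutoff from the legend
FC_MIN = 1.5       # fold-change cutoff from the legend


def call_direction(padj, fold_change):
    """Return 'up', 'down', or None for one gene in one method.

    Assumption: fold_change is a ratio (KO / WT), so 'down' means < 1 / FC_MIN.
    """
    if padj > P_ADJ_MAX:
        return None
    if fold_change > FC_MIN:
        return "up"
    if fold_change < 1 / FC_MIN:
        return "down"
    return None


def consensus_degs(results, min_methods=3):
    """Keep genes called in the same direction by at least min_methods methods.

    results: {method_name: {gene: (padj, fold_change)}}
    """
    votes = Counter()
    for table in results.values():
        for gene, (padj, fold_change) in table.items():
            direction = call_direction(padj, fold_change)
            if direction is not None:
                votes[(gene, direction)] += 1
    return {gene: d for (gene, d), n in votes.items() if n >= min_methods}


# Toy input with placeholder gene names and invented numbers.
toy_results = {
    "DESeq2":     {"GENE_A": (0.001, 2.4), "GENE_B": (0.2, 1.1),   "GENE_C": (0.004, 0.50)},
    "edgeR":      {"GENE_A": (0.002, 2.2), "GENE_B": (0.008, 1.6), "GENE_C": (0.003, 0.55)},
    "limma-voom": {"GENE_A": (0.02, 2.0),  "GENE_B": (0.3, 1.2),   "GENE_C": (0.006, 0.60)},
    "BRB":        {"GENE_A": (0.005, 2.1), "GENE_B": (0.5, 1.0),   "GENE_C": (0.009, 0.58)},
}
print(consensus_degs(toy_results))
```

In this toy input GENE_A passes in three methods, GENE_C in all four, and GENE_B in only one, so only the first two are retained.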

Knockout or knockdown of lnc-FANCI-2 affects RAS signaling. (A) Selective validation of increased expression of RAS signaling-related downstream genes in ΔPr-A9 and ΔYY1-D5 cells. (B) KD of lnc-FANCI-2 expression enhanced phosphorylation of Akt and Erk1/2 in SiHa cells but not in HeLa cells, which express no lnc-FANCI-2.

IGFBP3 and MCAM expression and lnc-FANCI-2. (A-B) RNA-seq reads-coverage of IGFBP3 and MCAM by IGV illustrates the increased expression of IGFBP3 and MCAM in lnc-FANCI-2 KO cells ΔPr-A9 and/or ΔYY1-D5 when compared to the parental WT CaSki cells.

Expression levels of lnc-FANCI-2 and MCAM are significantly associated with the survival outcome of TCGA CESC cancer patients. The expression of lnc-FANCI-2 (A) and of MCAM (B), along with survival of the same group of 304 cervical cancer patients, was analyzed by our in-house GradientScanSurv pipeline as described (Yi et al 2018, PLoS One 13(12):e0207590). The open grey circles in each plot are the expression levels of lnc-FANCI-2 (A) and MCAM (B) from all patient samples, ranked from high (left) to low (right), as indicated by the y-axis at the right side of the plot (RSEM-normalized values of RNA-seq data). At each cut-point denoted by patient number (x-axis), the log rank test (y-axis at the left side of the plot) was performed between the higher expression group at the left side of the cut-point and the lower expression group at the right side of the cut-point, and the resulting log rank p-values (blue dots) are shown at each cut-point along the x-axis. The horizontal red line is the p=0.05 cutoff line for the log rank test p-values. If a log rank p-value (blue dot) at a cut-point is below the cutoff (p<0.05), a vertical green line is shown at the corresponding cut-point, indicating a significant log rank test p-value, and a brown diamond is plotted either at the bottom of the vertical green line (A) if higher expression of lnc-FANCI-2 led to a less severe outcome (dying slower than the lower expressors), or at the top of the vertical green line (B) if higher expression of MCAM led to a more severe outcome (dying faster than the lower expressors).
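The cut-point scan described in this legend can be sketched in plain Python as follows. This is not the GradientScanSurv code: the function names are hypothetical, the log rank test is a textbook two-group version with a chi-square (1 d.f.) p-value, and the ten-patient dataset is invented purely to exercise the code.

```python
import math


def logrank_p(times, events, in_high):
    """Two-group log rank test p-value (chi-square approximation, 1 d.f.)."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for x in times if x >= t)                           # at risk, all
        n1 = sum(1 for x, g in zip(times, in_high) if x >= t and g)   # at risk, high group
        d = sum(1 for x, e in zip(times, events) if x == t and e)     # events at t, all
        d1 = sum(1 for x, e, g in zip(times, events, in_high)
                 if x == t and e and g)                               # events at t, high group
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    # Survival function of chi-square with 1 d.f. is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(obs_minus_exp ** 2 / var / 2))


def scan_cutpoints(expr, times, events, alpha=0.05):
    """Rank patients from high to low expression and test every split.

    Returns ({cut_point: p_value}, number of cut-points with p < alpha).
    """
    order = sorted(range(len(expr)), key=lambda i: expr[i], reverse=True)
    pvals = {}
    for k in range(1, len(order)):          # k = size of the high-expression group
        high = set(order[:k])
        in_high = [i in high for i in range(len(expr))]
        pvals[k] = logrank_p(times, events, in_high)
    good_count = sum(p < alpha for p in pvals.values())
    return pvals, good_count


# Invented toy data: high expressors survive longer; every patient has an event.
expr = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
times = [50, 48, 46, 44, 42, 5, 4, 3, 2, 1]
events = [1] * 10
pvals, good_count = scan_cutpoints(expr, times, events)
print(good_count, round(pvals[5], 4))
```

The sketch reports only significance at each cut-point; the direction of the effect (the diamond placement in the plots) would require comparing observed and expected events per group, which is omitted here.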

The original GoodCount was defined as the number of significant log rank tests (the default significance level was set at 0.05) across all possible cut-points for the original dataset. We used a bootstrap approach to permute the data 1,000 times; for each permuted dataset, the same procedure was performed as for the original dataset, yielding a permuted GoodCount. The GoodCount p-value was derived as the proportion of permuted datasets whose GoodCounts were no less than that of the original dataset (see Yi et al 2018, PLoS One 13(12):e0207590) and is shown at the top left corner of each plot: 0.014 for lnc-FANCI-2 in panel A and 0.004 for MCAM in panel B. A significant GoodCount p-value (blue dots in both panels A and B) indicates a significant association of the expression levels with the corresponding survival outcomes of the analyzed patient samples. P-values derived from univariate expression-based Cox regression models (coxph p-value: derived from the original gene expression values; coxph p-value By Rank: derived from the corresponding ranks of the gene expression values) are shown at the top left corner of each plot. The cut-point-specific FDR (green dots) and FDR2 (red dots) are shown at the top right corner of each plot. FDR for each cut-point was defined as the proportion of permuted datasets with log rank test p-values no higher than that of the original dataset. FDR2 for each cut-point was defined as the proportion of permuted datasets with log rank test p-values no higher than the default significance level of 0.05.
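The three permutation-based quantities defined above reduce to simple proportions. The sketch below assumes the permuted GoodCounts and permuted log rank p-values have already been computed by rerunning the scan on each permuted dataset; the function names and all numbers are illustrative and do not come from the study.

```python
def good_count_pvalue(original_count, permuted_counts):
    """Proportion of permuted GoodCounts that are no less than the original."""
    return sum(c >= original_count for c in permuted_counts) / len(permuted_counts)


def cutpoint_fdr(original_p, permuted_ps):
    """FDR at one cut-point: share of permuted p-values <= the original p-value."""
    return sum(p <= original_p for p in permuted_ps) / len(permuted_ps)


def cutpoint_fdr2(permuted_ps, alpha=0.05):
    """FDR2 at one cut-point: share of permuted p-values <= the significance level."""
    return sum(p <= alpha for p in permuted_ps) / len(permuted_ps)


# Invented numbers: 10 permutations instead of the 1,000 used in the legend.
toy_permuted_counts = [0, 2, 5, 1, 7, 0, 3, 9, 0, 1]
toy_permuted_ps = [0.5, 0.005, 0.2, 0.01]     # permuted p-values at one cut-point

print(good_count_pvalue(6, toy_permuted_counts))
print(cutpoint_fdr(0.01, toy_permuted_ps))
print(cutpoint_fdr2(toy_permuted_ps))
```

With 1,000 permutations the smallest non-zero value any of these proportions can take is 0.001, which sets the resolution of the reported GoodCount p-values.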