Increased expression of lnc-FANCI-2 in HPV16-infected cervical cancer tissues and HPV16- and HPV18-infected raft culture tissues. (A) Expression of lnc-FANCI-2 in HPV16+ cervical cancer tissues was examined by RNAscope RNA-ISH analysis. Nuclei were stained with DAPI (blue). (B) Human foreskin keratinocyte (HFK)-derived raft cultures without (HFK) or with HPV16 or HPV18 infections (HFK-HPV16 or HFK-HPV18) were examined at day 10 for lnc-FANCI-2 RNA by RNAscope RNA-ISH analysis. Scale bars: 25 μm in the original figures and 10 μm in the zoomed insets.

Knockout (KO) of lnc-FANCI-2 in CaSki cells affects cell proliferation, colony formation and migration. (A) Diagram and KO strategies of the lnc-FANCI-2 gene. At the top of the figure is the lnc-FANCI-2 gene structure with its alternative transcription start sites (TSS) and polyadenylation sites (pA). TSS2 and pA2, marked with heavier arrows, are predominantly used for lnc-FANCI-2 expression. The lower part of the figure shows the KO strategies. Red slashes represent gRNA-targeted sites used to create genomic DNA deletions by CRISPR-Cas9 technology. Deletion of YY1-binding motifs (ΔYY1) removed an 86-bp DNA fragment containing two YY1-binding motifs. Deletion of the lnc-FANCI-2 promoter (ΔPr) removed a ∼3.3-kb promoter region. Primers (P1-P4) used for PCR screening are shown as black arrows. (B) PCR screening of single cell clones with homozygous lnc-FANCI-2 KO, with the indicated primer sets diagrammed in (A). Single cell clones selected from the cells transfected with an empty vector served as control (Ctrl) cells. (C) Evaluation of lnc-FANCI-2 KO efficiency in the selected single cell clones (B) by RT-qPCR. (D) Northern blot validation of lnc-FANCI-2 KO efficiency from the individual single cell clones or the parental (WT) CaSki cells on polyA+ RNA (enriched from 100 μg of total cell RNA, lanes 1-4) or 10 μg of total cell RNA (lanes 5-9). Total cell RNA (2 μg) of HEK293T cells with ectopic expression of two isoforms (short, S or long, L) of lnc-FANCI-2 cDNA served as a control. Antisense oligo probe P5 (A) labeled with 32P was used for hybridization. GAPDH RNA served as a loading control and was hybridized with a GAPDH-specific oligo probe. (E-F) RNA in situ hybridization (RNA-ISH) validation of lnc-FANCI-2 KO efficiency by RNAscope technology. Two single cell clones were examined, with red color for lnc-FANCI-2 and blue color for the nucleus (E). Bar graphs show the copy number of lnc-FANCI-2 per cell in the WT, ΔYY1-D5 or ΔPr-A9 CaSki cells (each averaged from 200 cells) (F). 
**, P<0.01; ***, P<0.001 by two tailed Student t test.

KO of lnc-FANCI-2 in CaSki cells affects expression of cellular soluble receptors. (A) Dot blots show differentially expressed soluble receptors in ΔPr-A9 cells when compared to those in the WT CaSki cells. A Proteome Profiler Human sReceptor Array was used to examine 105 cellular soluble receptors. Label in red, upregulated soluble receptors; label in blue, downregulated soluble receptors. (B) Quantification of differentially expressed and/or released soluble receptors (A) in ΔPr-A9 cells when compared to those in the WT CaSki cells. Error bars represent the standard deviation of replicates in each proteome array. The dashed lines indicate the fold change (FC) threshold; soluble receptors scoring above or below the threshold (-/+ 0.3 FC) were considered differentially expressed and/or released. (C) Validation of several differentially expressed soluble receptors by immunoblot analysis using specific antibodies. Tubulin served as an internal loading control. *, non-specific band in the MCAM immunoblot.

Transcriptomic effect of lnc-FANCI-2 KO in CaSki cells by RNA-seq analysis. (A) RNA-seq reads-coverage maps by IGV showing the expression levels of lnc-FANCI-2 in two KO cell clones, ΔPr-A9 and ΔYY1-D5, compared with the WT CaSki cells. One representative coverage profile of four in each type of cells is shown by IGV. Red slashes represent the deleted genomic region, with fourteen validated lnc-FANCI-2 RNA isoforms shown below [8]. (B) Similarity heatmap comparing ΔPr-A9, ΔYY1-D5 and WT cells, generated from Limma-normalized counts of each sample by the Pearson complete linkage method. (C) Volcano plot visualization of differentially expressed genes (DEGs) in ΔPr-A9 and ΔYY1-D5. The genes with the most significant change in the increased (red) or decreased (blue) expression are indicated. FC, fold change. (D) Venn diagram depicting the overlapped DEGs between ΔPr-A9 and ΔYY1-D5 cells over the WT CaSki cells. (E) Heatmap showing overlapped DEGs with FPKM ≥7 in ΔPr-A9 and ΔYY1-D5 cells when compared with the WT CaSki cells. (F) Validation of 12 selected upregulated or downregulated DEGs shown in the heatmap by RT-PCR with gene-specific primers in the presence (+) or absence (-) of reverse transcriptase (RT). GAPDH served as an internal RNA control. The relative expression level of each gene was calculated from band density after normalizing to the GAPDH RNA band, with the expression level in the WT CaSki cells set as 1. (G) Validation of the upregulated expression of IGFBP3 in ΔPr-A9 cells over the WT CaSki cells by immunoblot analysis.

Pathway analyses of DEGs identified by RNA-seq. (A) Top 3 upregulated and top 4 downregulated pathways in ΔPr-A9 and ΔYY1-D5 cells over the WT CaSki cells. Data were generated by Gene set enrichment analysis (GSEA) performed with Hallmark gene sets. NES stands for normalized enrichment score. (B-E) Each GSEA Enrichment plot shows the enrichment score and gene hits enriched in ΔPr-A9 cells using Hallmark gene sets, KRAS_SIGNALING_UP (B), EPITHELIAL_MESENCHYMAL_TRANSITION (C), INTERFERON_GAMMA_RESPONSE (D) or INTERFERON_ALPHA_RESPONSE (E). padj, adjusted p value; Zero crossed 7342, the middle number of total genes in the GSEA at ranking. The heatmaps below the enrichment plots (B and C) visualize the genes enriched in respective pathways of KRAS_SIGNALING_UP (B) and EPITHELIAL_MESENCHYMAL_TRANSITION (C) in ΔPr-A9 cells when compared to the WT CaSki cells.

CaSki cells with lnc-FANCI-2 KO exhibit activation of the RAS signaling pathway. (A) CaSki cells with lnc-FANCI-2 KO (top, ΔPr-A9 cells) or knockdown (KD, lower) by lnc-FANCI-2-specific siRNAs display increased RAS GTPase activity when compared to the WT CaSki cells (top) or the WT cells treated with a nonspecific control siRNA (siNS). The relative RAS GTPase activity was measured by a RAS GTPase Chemi ELISA assay. *, P< 0.05; ***, P<0.001 by two tailed Student t test. (B and C) Selective validation of increased expression of RAS signaling-related downstream genes in lnc-FANCI-2 KO ΔPr-A9 cells (B) or lnc-FANCI-2 KD CaSki cells (C) over the WT CaSki cells. Relative expression of the indicated genes in ΔPr-A9 cells in comparison to the WT cells (B) or in the WT CaSki cells treated for 48 h and 96 h with a nonspecific control siRNA (siNS) or a siRNA specifically targeting the lnc-FANCI-2 exon 3 (C) was examined by immunoblot analyses using corresponding antibodies as indicated. Tubulin served as an internal control for each blot. The level of p-Akt or p-Erk was calculated by normalizing to total Akt or Erk protein, and the levels of the other proteins by normalizing to tubulin. The protein level in the WT cells or WT cells treated with siNS was set as 1. The KD efficiency of lnc-FANCI-2 RNA (C, top bar graphs) was examined by RT-qPCR. (D) The effect of blocking RAS signaling on the expression of MCAM and VIM using PI3K inhibitor LY294002 (20 μM) and MEK inhibitor U0126 (10 μM). The protein at each time point was examined by immunoblot analysis with a corresponding antibody. The level of MCAM or VIM was calculated after normalizing to tubulin. The protein level in ΔPr-A9 cells without the inhibitor was set as 100%. *, nonspecific protein band. (E) The time-dependent cell viability of the WT CaSki and ΔPr-A9 cells in the presence of 20 μM LY294002 or 10 μM U0126. Data were obtained at each time point after normalizing to the cells treated with DMSO. 
The mean + SD at each data point was calculated from 6 samples combined from two independent experiments.
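
The normalization described for the immunoblot panels (phospho-protein to its total protein, other proteins to tubulin, reference sample set as 1) can be sketched as below. This is a minimal illustration only: the function name and all densitometry readings are hypothetical and are not taken from the blots.

```python
def relative_level(target, loading, ref_target, ref_loading):
    """Ratio of target to its normalizer in a sample, divided by the
    same ratio in the reference sample (so the reference equals 1)."""
    return (target / loading) / (ref_target / ref_loading)

# Hypothetical band densities in arbitrary units (assumed values)
wt = {"p-Akt": 100.0, "Akt": 200.0, "MCAM": 50.0, "tubulin": 400.0}
ko = {"p-Akt": 300.0, "Akt": 200.0, "MCAM": 150.0, "tubulin": 300.0}

# p-Akt is normalized to total Akt; MCAM is normalized to tubulin
p_akt_rel = relative_level(ko["p-Akt"], ko["Akt"], wt["p-Akt"], wt["Akt"])
mcam_rel = relative_level(ko["MCAM"], ko["tubulin"], wt["MCAM"], wt["tubulin"])
wt_rel = relative_level(wt["MCAM"], wt["tubulin"], wt["MCAM"], wt["tubulin"])

print(p_akt_rel, mcam_rel, wt_rel)
```

Multiplying the result by 100 gives the percentage scale used in panel D, where the untreated ΔPr-A9 sample is the reference.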

lnc-FANCI-2 suppresses the expression of IGFBP3 and MCAM. (A and B) Transient rescue expression of lnc-FANCI-2 in ΔPr-A9 cells inhibits expression of IGFBP3. ΔPr-A9 cells were transfected with a cDNA plasmid of the major isoform lnc-FANCI-2a-PA2 (GenBank ACC. No. MT669800.1). lnc-FANCI-2 in red and IGFBP3 RNA in green were detected by RNAscope RNA-ISH at 24 h post transfection using each specific antisense RNA probe and imaged by confocal microscopy (A). Expression levels of lnc-FANCI-2 RNA and IGFBP3 RNA in four neighboring cells (B) were measured by signal intensity along a line crossing over the stained cells in A (white line arrow). (C) Validation of differential expression of MCAM RNA in ΔPr-A9 cells by RT-PCR in the presence (+) or absence (-) of reverse transcriptase (RT). A primer pair with one primer at exon 11 and the other at exon 13 of MCAM RNA (NM_006500) was used. GAPDH served as a loading control. (D) The major isoform lnc-FANCI-2a-PA2 repressed the expression of MCAM. ΔPr-A9 cells were transfected with lnc-FANCI-2a-PA2 cDNA plasmid. lnc-FANCI-2 RNA in red was detected by RNAscope ISH and MCAM protein in green was detected by IF with an anti-MCAM antibody. (E) Expression levels of lnc-FANCI-2 RNA and MCAM protein in three neighboring cells were measured by signal intensity along a line crossing over the stained cells (D, white line arrow). (F) Calculation of the expression levels of MCAM in lnc-FANCI-2 positive cells (n=35) and lnc-FANCI-2 negative cells (n=54) by fluorescent intensity. (G) Subcellular MCAM distributions in the nucleus (N) and cytoplasm (C) of WT CaSki cells and ΔPr-A9 cells by immunoblot analysis. Fractionation efficiency and sample loading were controlled by cytoplasmic (represented by tubulin) and nuclear (represented by hnRNP C1/C2) proteins. 
(H-I) Correlation and survival analysis of lnc-FANCI-2 and MCAM expression in cervical squamous cell carcinoma (CESC) cases from The Cancer Genome Atlas (TCGA) datasets by the GEPIA web server (http://gepia.cancer-pku.cn/). The negative correlation at R=-0.321 (with p-value=1.02e-08) of lnc-FANCI-2 with MCAM in cervical cancer patients was obtained from the RNA-seq data from the TCGA for CESC tumor type downloaded from the TCGA data portal (https://portal.gdc.cancer.gov/). Only primary solid tumor samples (n=304 after exclusion of 2 metastatic samples and 3 normal samples) were subjected to analysis, with the data shown as a scatter plot (H). Kaplan-Meier plot with log rank test p-value at 5.74e-05 (I) shows MCAM as a biomarker for poor prognosis of cervical cancer survival with a lower quartile group cutoff. RNA-seq and survival data are derived from the TCGA CESC cancer patients. (J-K) AKT and ERK phosphorylation is partially regulated by MCAM and IGFBP3 in CaSki cells. KD of MCAM (J) and IGFBP3 (K) expression in WT parental CaSki (left) or ΔPr-A9 (right) cells was performed by treatment with MCAM siRNA or IGFBP3 siRNA with or without lnc-FANCI-2 siRNA for 48 h. Expression levels of individual proteins as indicated were examined by immunoblot analyses using the corresponding antibodies. The level of MCAM, p-Akt, or p-Erk1/2 was calculated after normalizing to tubulin. The protein level in the non-targeting siRNA (siNS) control cells was set as 1.

lnc-FANCI-2 interacts with host factors to regulate the RAS signaling pathway. (A) The lnc-FANCI-2-associated proteins in the WT CaSki cells were identified by isolation of RNA-protein complexes using RNA purification (IRPCRP)-Mass spectrometry technology. (B) lnc-FANCI-2 RNA in the IRPCRP-1 and IRPCRP-2 pulldowns had the pooled antisense biotinylated oligos (pool 1 with oligos in even numbers and pool 2 with oligos in odd numbers) immobilized to avidin-beads first before being mixed with cell lysates, while the IRPCRP-3 and IRPCRP-4 pulldowns had the oligo pools 1 and 2 separately mixed with cell lysates first before addition to the avidin-beads for the RNA pull-downs. RT-PCR in the absence (-) or presence (+) of reverse transcriptase (RT) was carried out on the RNA isolated from the individual IRPCRP experiments using a primer pair of oHBL5 and oHBL12 (Table S5) specific for lnc-FANCI-2 RNA detection. Beads only (no oligos) IRPCRP experiments served as a negative control. Total RNA from the WT CaSki cells after sonication was used as an input control. Arrow indicates the detected lnc-FANCI-2 RNA. (C) Proteins associated with lnc-FANCI-2 RNA identified from lnc-FANCI-2 IRPCRP pulldowns. A total of 32 proteins were specifically pulled down from lnc-FANCI-2 IRPCRP reactions 1-4 (PSM≥2 from two separate pulldowns, Table S4), with the top 10 proteins binding lnc-FANCI-2 shown in order of the number of identified PSM. (D) Expression of p-Akt and p-Erk in CaSki WT cells 48 h after siRNA-mediated KD of MAP4K4 or lnc-FANCI-2 was examined by immunoblotting with the corresponding antibodies. GAPDH served as a sample loading control. Fold change of the indicated proteins in the cells with KD of MAP4K4 or lnc-FANCI-2 over the cells treated with a non-targeting siRNA (siNS) was calculated after normalizing to GAPDH. (E) A proposed model illustrates how lnc-FANCI-2-protein complexes inhibit the RAS signaling pathway to control Akt/Erk phosphorylation and expression of host genes. 
In the absence of lnc-FANCI-2 in CaSki cells, RAS signaling can be triggered by extracellular stimuli, such as IGFBP3, MCAM ligands, etc. As a result, phosphorylation of AKT and Erk leads to cascaded responses of transcription factors (TFs) to regulate the expression of RAS signaling responder genes, such as IGFBP3, MCAM, etc.

Selective expression of lnc-FANCI-2 RNA in a subset of cervical cancer cell lines. (A) RT-qPCR detection of lnc-FANCI-2 expression in HPV-positive cervical cancer cell lines: SiHa (HPV16+), CaSki (HPV16+), HeLa (HPV18+), C4II (HPV18+), and W12 subclones 20861 (HPV16+, integrated) and 20863 (HPV16+, episomal) and HPV-negative cell lines: C33A (cervical cancer cells with mutations of p53 and pRb), HCT116 (colorectal cancer cells), BCBL-1 (body cavity B lymphoma cells), HEK293 (Ad5 E1/E2-immortalized human kidney cells), and HaCaT (spontaneously immortalized human epidermal cells). (B) HeLa cells express no lnc-FANCI-2 when compared with C33A cells as determined by Northern blot. (C) Subcellular fractionation shows that lnc-FANCI-2 is predominantly cytoplasmic in CaSki cells but nuclear in SiHa cells. Cytoplasmic and nuclear fractionation efficiency was determined by Western blot analyses of nuclear protein SRSF3 (serine- and arginine-rich splicing factor 3) and cytoplasmic GAPDH protein. Total fractionated cytoplasmic and nuclear RNAs from CaSki and SiHa cells were quantified for lnc-FANCI-2 by RT-qPCR, with cytoplasmic GAPDH RNA serving as an internal control for RNA fractionation efficiency. (D) Subcellular lnc-FANCI-2 (red) localization in CaSki and SiHa cells was examined by RNAscope RNA ISH analysis. Nuclei were stained with DAPI (blue). Scale bars: 25 μm in the left and 10 μm in the zoom.

Characterization of the effect of lnc-FANCI-2 KO on CaSki cell growth and viral oncoprotein expression. (A) The characteristic cell growth behavior and morphology of the wild type (WT) cells and single cell clones with deletion of the YY1 motifs (ΔYY1, clone # D5) or the promoter (ΔPr, clone # A9) were imaged at 24 h after spreading. (B) Effect of lnc-FANCI-2 KO in CaSki cells on the expression of HPV16 E6 and E7 and their downstream targets. Total cell extracts from parental CaSki cells and ΔPr-A9 cells were examined by immunoblotting with corresponding antibodies. Tubulin served as a protein loading control. The relative protein levels of E6, E7, p53, and E2F1 were calculated after normalizing to tubulin.

GSEA Enrichment plots show the enrichment score and gene hits enriched in ΔYY1-D5 cells using Hallmark gene sets: (A) KRAS_SIGNALING_UP. (B) EPITHELIAL_MESENCHYMAL_TRANSITION. (C) INTERFERON_ALPHA_RESPONSE. (D) INTERFERON_GAMMA_RESPONSE. The heatmap on the right side of the Enrichment plot in each panel visualizes the differentially expressed genes enriched in each pathway in ΔYY1-D5 cells.

Pathway map of the RAS Initiative with highlighted differentially expressed genes (DEGs) in ΔPr-A9 cells vs parental wild type cells. DEGs are mainly distributed and clustered in two main branches of the RAS pathway: the MAPK signaling branch and the PI3K/AKT branch. DEGs were derived from RNA-seq data at a cutoff of adjusted p-value ≤0.01 and fold change >1.5 for both up- and down-regulated genes (indicated by red and blue arrows, respectively) between ΔPr-A9 cells and WT CaSki cells using three common analysis methods (DESeq2, edgeR, and limma-voom) and the in-house BRB analysis pipeline from NCI. Highlighted genes in the pathway map were called as DEGs by at least three of the four analysis methods, although in most cases these DEGs behaved consistently across all 4 methods. The MAPK signaling branch is highlighted in a transparent green box and the PI3K/AKT branch in a transparent yellow box within the pathway map of the RAS signaling pathway. The map was collated from community inputs organized and stimulated by the RAS Initiative and is maintained as a common knowledge base at https://www.cancer.gov/research/key-initiatives/ras/ras-central/blog/2015/ras-pathway-v2; it is also described in a recent review (Figure 1 in Nissley and McCormick, 2022).

Reference (where the Ras pathway map has been described):

Nissley, D. V. and McCormick, F. (2022). RAS at 40: update from the RAS Initiative. Cancer Discov. https://doi.org/10.1158/2159-8290.CD-21-1554
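
The DEG selection rule in the pathway map legend (adjusted p-value cutoff, fold change cutoff, agreement of at least three of four methods) can be sketched as below. This is an illustration under stated assumptions, not the pipeline used in the study: fold change is assumed to be a linear KO/WT ratio with down-regulation read as a ratio below 1/1.5, and the gene names and values are hypothetical.

```python
def call_direction(padj, ratio, padj_cutoff=0.01, fc_cutoff=1.5):
    """Return 'up', 'down', or None for one gene in one method.
    ratio is an assumed linear KO/WT expression ratio."""
    if padj > padj_cutoff:
        return None
    if ratio > fc_cutoff:
        return "up"
    if ratio < 1.0 / fc_cutoff:
        return "down"
    return None

def consensus_degs(results, min_methods=3):
    """results maps method name -> {gene: (padj, ratio)}.
    Keep genes called in the same direction by >= min_methods methods."""
    votes = {}
    for per_gene in results.values():
        for gene, (padj, ratio) in per_gene.items():
            direction = call_direction(padj, ratio)
            if direction is not None:
                key = (gene, direction)
                votes[key] = votes.get(key, 0) + 1
    return {gene: d for (gene, d), n in votes.items() if n >= min_methods}

# Hypothetical per-method results (gene names and numbers are made up)
results = {
    "DESeq2":     {"GENE_A": (0.001, 2.4), "GENE_B": (0.005, 0.40), "GENE_C": (0.020, 3.0)},
    "edgeR":      {"GENE_A": (0.002, 2.2), "GENE_B": (0.008, 0.50), "GENE_C": (0.004, 2.8)},
    "limma-voom": {"GENE_A": (0.009, 2.0), "GENE_B": (0.030, 0.45), "GENE_C": (0.050, 2.5)},
    "BRB":        {"GENE_A": (0.200, 1.9), "GENE_B": (0.001, 0.30), "GENE_C": (0.006, 2.9)},
}
consensus = consensus_degs(results)
print(consensus)
```

Requiring agreement in direction as well as significance is a design choice of this sketch; the legend itself only states that highlighted genes were DEGs in at least three methods.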

Knockout or knockdown of lnc-FANCI-2 affects RAS signaling. (A) Selective validation of increased expression of RAS signaling-related downstream genes in ΔPr-A9 and ΔYY1-D5 cells. (B) KD of lnc-FANCI-2 expression enhanced phosphorylation of Akt and Erk1/2 in SiHa cells but not in HeLa cells, which express no lnc-FANCI-2.

IGFBP3 and MCAM expression and lnc-FANCI-2. (A-B) RNA-seq reads-coverage of IGFBP3 and MCAM by IGV illustrates the increased expression of IGFBP3 and MCAM in lnc-FANCI-2 KO cells ΔPr-A9 and/or ΔYY1-D5 when compared to the parental WT CaSki cells.

Expression levels of lnc-FANCI-2 and MCAM are significantly associated with TCGA CESC cancer patients’ survival outcome. The expression of lnc-FANCI-2 (A) and of MCAM (B) along with survival of the same group of 304 cervical cancer patients were analyzed by our in-house GradientScanSurv pipeline as described (Yi et al 2018, PLoS One 13(12):e0207590). The open grey circles in each plot are ranked expression levels from high (left) to low (right) of lnc-FANCI-2 (A) and MCAM (B) from all patient samples as indicated by the y-axis at the right side of the plot (in RSEM normalized value of RNAseq data). At each breaking point denoted by patient number (x-axis), the log rank test (y-axis at the left side of the plot) was performed between the higher expression group at the left side of the cut-point and the lower expression group at the right side of the cut-point, and the resulting log rank p-values (blue dots) are shown at each cut-point on the x-axis. The horizontal red line is the p=0.05 cutoff line for the log rank test p-values. If a log rank p-value (blue dot) at a cut-point is below the cutoff (p<0.05), a vertical green line is shown at the corresponding cut-point, indicative of a significant log rank test p-value, and a brown diamond is plotted either at the bottom part of the vertical green line (A) if higher expression of lnc-FANCI-2 led to a less severe outcome (dying slower than the lower expressors), or at the top part of the vertical green line (B) if higher expression of MCAM led to a more severe outcome (dying faster than the lower expressors).

The original GoodCount was defined as the number of significant log rank tests (the default significance level was set as 0.05) across all possible cut-points for the original dataset. We employed the bootstrap approach to permute the data 1000 times in this case; each time, the same procedure was performed as for the original dataset, and a permuted GoodCount was obtained for each permuted dataset. The GoodCount p-value was derived as the proportion of the times that the GoodCounts of permuted datasets were no less than that of the original dataset (see Yi et al 2018, PLoS One 13(12):e0207590) and is shown at the top left corner of each plot, 0.014 for lnc-FANCI-2 in panel A and 0.004 for MCAM in panel B. A significant GoodCount p-value (blue dots in both panels A and B) is indicative of significant association of the expression levels with the corresponding survival outcomes of the analyzed patient samples. Derived p-values from univariate expression-based Cox regression models (coxph p-value: derived by original gene expression values; or coxph p-value By Rank: derived by corresponding ranks of gene expression values) are shown at the top left corner of each plot. The cut-point-specific FDR (green dots) and FDR2 (red dots) are shown at the top right corner of each plot. FDR for each cut-point was defined as the portion of cases of permuted datasets that have log rank test p-values no higher than that of the original dataset. FDR2 for each cut-point was defined as the portion of cases of permuted datasets that have log rank test p-values no higher than the default significance setting of 0.05.
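
The GoodCount procedure described above can be illustrated with the following minimal sketch. It is not the GradientScanSurv pipeline itself: the two-group log rank test is a standard textbook form written from scratch, the permutation is assumed to shuffle expression values relative to survival records, the `min_group` limit on cut-points is an added assumption, and the toy data are hypothetical.

```python
import math
import random

def logrank_p(times, events, in_high):
    """Two-group log rank test p-value (chi-square with 1 df)."""
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)                       # at risk, all
        n1 = sum(1 for ti, g in zip(times, in_high) if ti >= t and g)  # at risk, high group
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)    # deaths at t
        d1 = sum(1 for ti, e, g in zip(times, events, in_high)
                 if ti == t and e and g)                            # deaths at t, high group
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    chi2 = obs_minus_exp ** 2 / var
    return math.erfc(math.sqrt(chi2 / 2))  # survival function of chi-square, 1 df

def good_count(expr, times, events, alpha=0.05, min_group=2):
    """Number of cut-points (samples ranked high to low) with p < alpha."""
    order = sorted(range(len(expr)), key=lambda i: -expr[i])
    count = 0
    for cut in range(min_group, len(order) - min_group + 1):
        high = set(order[:cut])
        in_high = [i in high for i in range(len(expr))]
        if logrank_p(times, events, in_high) < alpha:
            count += 1
    return count

def good_count_pvalue(expr, times, events, n_perm=1000, seed=0):
    """Proportion of permuted datasets whose GoodCount >= the observed one."""
    rng = random.Random(seed)
    observed = good_count(expr, times, events)
    shuffled = list(expr)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # assumed permutation scheme
        if good_count(shuffled, times, events) >= observed:
            hits += 1
    return observed, hits / n_perm

# Hypothetical toy cohort: higher expression pairs with shorter survival
expr = [12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
times = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
events = [1] * 12
observed, p_value = good_count_pvalue(expr, times, events, n_perm=200)
print(observed, p_value)
```

The fixed seed makes the permutation result reproducible; the legend's analysis used 1000 permutations, whereas the toy call uses 200 only to keep the example fast.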