circFL-seq reveals full-length circular RNAs with rolling circular reverse transcription and nanopore sequencing

  1. Zelin Liu
  2. Changyu Tao
  3. Shiwei Li
  4. Minghao Du
  5. Yongtai Bai
  6. Xueyan Hu
  7. Yu Li
  8. Jian Chen
  9. Ence Yang (corresponding author)
  1. Institute of Systems Biomedicine, Department of Medical Bioinformatics, School of Basic Medical Sciences, Peking University Health Science Center, Key Laboratory for Neuroscience, Ministry of Education/National Health Commission of China, NHC Key Laboratory of Medical Immunology (Peking University), China
  2. Department of Human Anatomy, Histology & Embryology, School of Basic Medical Sciences, Peking University Health Science Center, China
  3. Department of Radiation Medicine, School of Basic Medical Sciences, Peking University Health Science Center, China
  4. Department of Microbiology & Infectious Disease Center, School of Basic Medical Sciences, Peking University Health Science Center, China
  5. Chinese Institute for Brain Research, China
5 figures, 3 tables and 15 additional files

Figures

Figure 1 with 1 supplement
Diagram of circFL-seq workflow.

(A) Experimental operation of circFL-seq consisted of circRNA enrichment, library construction, and nanopore sequencing. (B) PCR validation of rolling circle products from the circFL-seq cDNA library. The yellow and green lines indicate the positions of the PCR primers. The upward triangle, downward triangle, and circle symbols denote the 0-circle, 1-circle, and 2-circle cDNA products. (C) Computational pipeline of circFL-seq. circFL-seq clean reads were directly used in RG mode or were self-corrected for consensus sequences in cRG mode to reconstruct full-length circRNAs. circRNA, circular RNA.

Figure 1—source data 1

Original figures of gels.

This file includes figures with uncropped gels.

https://cdn.elifesciences.org/articles/69457/elife-69457-fig1-data1-v2.zip
Figure 1—figure supplement 1
Sanger sequencing of rolling circular bands.

PCR was performed with the circFL-seq library of HEK293T cells as a template.

Figure 2 with 8 supplements
Analysis of full-length circRNA in eight samples.

(A) Stacked bar plot showing the number of full-length circRNA isoforms detected by RG and cRG for six cell lines. (B) Expression correlation matrix for circRNA BSJs and isoforms of all samples. The color scale corresponds to Pearson’s correlation coefficient. (C) Stacked bar plot showing the number of circRNA isoforms with read counts≥5 from known or novel BSJs based on the circRNA database. (D) Boxplot showing the length distribution per isoform for circRNA isoforms with read counts≥5 in all samples. The left and right edges of each box are the lower and upper quartiles, the bar is the median, and the whiskers are the median±1.5×interquartile range. (E) Stacked bar plot showing the fraction of exon numbers per isoform for circRNA isoforms with read counts≥5 in all samples. (F) Boxplot showing the length distribution per exon for circRNA isoforms with read counts≥5 in all samples. The bottom and top edges of each box are the lower and upper quartiles, the bar is the median, and the whiskers are the median±1.5×interquartile range. (G) Diagram of four types of alternative splicing (AS) events in circRNA: exon skipping (ES), alternative 3′ splice site (A3SS), alternative 5′ splice site (A5SS), and intron retention (IR). (H) Plot showing the coverage of full-length circRNA reads at the position of CDR1as for circFL-seq data of six cell lines (replicate data were merged). Structures of the two isoforms of CDR1as are shown at the bottom. (I–K) AS events (one ES and one IR) of circ-TMEM138 detected by circFL-seq (I), agarose gel electrophoresis (J), and Sanger sequencing (K). Red/blue arcs are forward/reverse primers for validation of back-splicing junctions (BSJs) and forward splicing junctions (FSJs). Asterisks denote FSJs. Downward triangles denote BSJs. BSJ, back-splicing junction; circRNA, circular RNA; RG, reference guide.

Figure 2—figure supplement 1
Clean read distribution of circFL-seq data of six cell lines.

(A) Histograms showing the distribution of clean reads (blue) and full-length circRNA reads (red) for each sample. The percentages represent the circRNA read proportion in clean reads. (B) Histograms showing the distribution of full-length circRNA read amounts. circRNA, circular RNA.

Figure 2—figure supplement 2
CircRNA reads identified from circFL-seq data of six cell lines.

(A) Stacked bar plot showing the number of full-length circRNA isoforms detected by RG and cRG for eight samples. (B) Boxplot showing the read qscore distribution of circRNA isoforms of all samples. The qscore, which represents read quality, was extracted from the sequencing summary file. (C, D) Boxplots showing the error rate of mismatches (C) and indels (D) for raw reads and consensus sequences (CS) of cRG-identified circRNA isoforms. circRNA, circular RNA; RG, reference guide.

Figure 2—figure supplement 3
Scatter plot showing the correlation of circRNA at the BSJ level (A, B) and isoform level (C, D) between circFL-seq replicates.

CircRNA BSJs/isoforms with read counts>0 in at least one replicate were included. BSJ, back-splicing junction; circRNA, circular RNA.

Figure 2—figure supplement 4
Diagram of circRNA types.

The classification is based on the positions of BSJs and boundary exons following the principles below. Exonic: the circRNA body is totally located inside of one gene from the same strand. Both of the BSJs are identical to annotated junctions. Intronic: the circRNA body is also totally located inside one gene from the same strand. However, at least one of the boundary exons does not overlap with any annotated exon. Novel splicing site (NSS): the circRNA body is also totally located inside one gene from the same strand. However, both boundary exons overlap with annotated exons, with at least one BSJ different from the annotated linear junction. Intergenic: the whole body of circRNA is located in an intergenic region. Novel UTR: the circRNA body partially overlaps with only one gene from the same strand, and at least one BSJ is located in an intergenic region. Antisense: there is no overlap between circRNA and any gene from the same strand. However, the circRNA overlaps gene(s) from the antisense strand. Read-through: BSJs are located in different genes with the same strand. BSJ, back-splicing junction; circRNA, circular RNA.
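
The rules in this caption amount to a small decision procedure. The following is a minimal sketch of it in Python, assuming the genomic overlap tests have already been reduced to boolean flags. The class name, field names, rule ordering, and the "unclassified" fallback are illustrative assumptions and are not taken from the circFL-seq software.

```python
from dataclasses import dataclass


@dataclass
class CircRNAContext:
    """Hypothetical summary flags for one circRNA (not circFL-seq identifiers)."""
    bsjs_in_different_genes: bool = False    # the two BSJ sites lie in different same-strand genes
    body_inside_one_gene: bool = False       # whole body lies inside one same-strand gene
    boundary_exons_annotated: bool = False   # both boundary exons overlap annotated exons
    bsjs_match_annotation: bool = False      # both BSJ sites are identical to annotated junctions
    overlaps_same_strand_gene: bool = False  # any overlap with a gene on the same strand
    bsj_in_intergenic_region: bool = False   # at least one BSJ site is intergenic
    overlaps_antisense_gene: bool = False    # any overlap with a gene on the opposite strand


def classify_circrna(c: CircRNAContext) -> str:
    """Apply the caption's rules in an assumed order, most specific first."""
    if c.bsjs_in_different_genes:
        return "read-through"
    if c.body_inside_one_gene:
        # Intronic: at least one boundary exon misses every annotated exon.
        if not c.boundary_exons_annotated:
            return "intronic"
        # Exonic if both BSJ sites are annotated, otherwise novel splicing site.
        return "exonic" if c.bsjs_match_annotation else "NSS"
    if c.overlaps_same_strand_gene and c.bsj_in_intergenic_region:
        return "novel UTR"
    if not c.overlaps_same_strand_gene:
        return "antisense" if c.overlaps_antisense_gene else "intergenic"
    return "unclassified"  # assumed fallback for cases the caption does not cover


example = CircRNAContext(body_inside_one_gene=True,
                         overlaps_same_strand_gene=True,
                         boundary_exons_annotated=True,
                         bsjs_match_annotation=False)
print(classify_circrna(example))
```

The example describes a circRNA whose boundary exons are annotated but whose BSJ differs from the annotated junctions, so it falls into the NSS class.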

Figure 2—figure supplement 5
Cumulative distribution of read counts for circRNA isoforms identified by circFL-seq from six cell lines.

CircRNA isoforms were classified based on database status and annotation types. (A) For each annotation type (exonic, intronic, NSS, intergenic, novel UTR, antisense, and read-through), the cumulative distribution of read counts is shown separately for known and novel isoforms, according to whether their BSJs are annotated in the circRNA database. (B) For known and novel isoforms, the cumulative distribution of read counts is shown separately for the seven annotation types. BSJ, back-splicing junction; circRNA, circular RNA.

Figure 2—figure supplement 6
CircRNAs with exon skipping validated by RT-PCR and Sanger sequencing in HeLa cells.

RT-PCR was performed with RNase R-treated RNA. The coverage of full-length circRNA reads mapped to the reference genome is shown. circRNA, circular RNA.

Figure 2—figure supplement 7
CircRNAs with alternative 3′/5′ splicing sites (A3SS for circRNAs from MCU and MRS2, A5SS for circRNA from SNX25) were validated by RT-PCR and Sanger sequencing in HeLa cells.

RT-PCR was performed with RNase R-treated RNA. The coverage of full-length circRNA reads mapped to the reference genome is shown. A3SS, alternative 3′ splice site; A5SS, alternative 5′ splice site; circRNA, circular RNA.

Figure 2—figure supplement 8
CircRNAs with intron retention validated by RT-PCR and Sanger sequencing in HeLa cells.

RT-PCR was performed with RNase R-treated RNA. The coverage of full-length circRNA reads mapped to the reference genome is shown. circRNA, circular RNA.

Figure 3 with 7 supplements
Quantification of circRNA at the BSJ and isoform levels.

(A) Expression correlation matrix of circRNA BSJ quantified by circFL-seq and RNA-seq for six cell lines. The numbers in the matrix represent Pearson’s correlation coefficients. (B) Comparison of differentially expressed circRNA (DEC) detection between circFL-seq and RNA-seq. Top panel: Venn diagram showing the number of DECs detected by circFL-seq (green), RNA-seq (purple), and both methods (orange). Bottom panel: scatter plot showing the correlation of fold change (log base 2) for HeLa and SKOV3 cells between circFL-seq and RNA-seq. (C) Scatter plot showing the correlation of the expression levels of 16 circRNA BSJs for HeLa (left) and SKOV3 (right) cells between circFL-seq and RT-qPCR. (D) Scatter plot showing the correlation of fold changes (log base 2) of the 16 BSJs for HeLa and SKOV3 cells between circFL-seq and RT-qPCR. (E) Plot showing the adjusted coverage of full-length circRNA reads and RNA-seq reads in the position of circRNA from PLOD2. The circular structures of the two circRNA isoforms are shown in the lower panel. (F) Scatter plot showing the correlation of the transcript ratio of 18 circRNA isoforms from nine circRNA BSJs (each BSJ has two isoforms) for HeLa (left) and SKOV3 (right) cells between circFL-seq and RT-qPCR. The relative expression of target BSJs/isoforms quantified by RT-qPCR was determined with RNase R-treated samples and GAPDH from total RNA without RNase R treatment as a reference. (G) Scatter plot showing the correlation of the differential ratio (∆ratio) of the 18 isoforms for HeLa and SKOV3 cells between circFL-seq and RT-qPCR. The shaded areas denote 95% confidence intervals. BSJ, back-splicing junction; circRNA, circular RNA.
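
The quantities compared in these panels can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' analysis code: the transcript ratio is taken to be an isoform's share of the reads from all isoforms of the same BSJ, ∆ratio is taken to be the difference of that share between two samples, and the pseudocount in the fold change is an added convenience. All function names and read counts are hypothetical.

```python
import math


def transcript_ratios(isoform_counts):
    """Share of each isoform among all reads of one BSJ (assumed definition)."""
    total = sum(isoform_counts.values())
    return {iso: n / total for iso, n in isoform_counts.items()}


def delta_ratio(counts_a, counts_b):
    """Difference in transcript ratio between sample A and sample B (assumed definition)."""
    ratios_a = transcript_ratios(counts_a)
    ratios_b = transcript_ratios(counts_b)
    return {iso: ratios_a[iso] - ratios_b[iso] for iso in ratios_a}


def log2_fold_change(a, b, pseudocount=1.0):
    """log2(A/B); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((a + pseudocount) / (b + pseudocount))


def pearson(xs, ys):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical read counts for the two isoforms of one BSJ in two samples.
hela = {"isoform_1": 30, "isoform_2": 10}
skov3 = {"isoform_1": 10, "isoform_2": 30}
print(delta_ratio(hela, skov3))
print(log2_fold_change(7, 1))
```

With two isoforms per BSJ, the two ∆ratio values are equal in magnitude and opposite in sign, so each BSJ contributes a mirrored pair of points to a ∆ratio scatter plot.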

Figure 3—figure supplement 1
Correlations of circRNA BSJs among RNA-seq samples from six cell lines.

(A) Expression correlation matrix for circRNA BSJs among six cell lines. The color scale corresponds to Pearson’s correlation coefficients. Scatter plot showing the correlations of BSJs between HeLa (B) or SKOV3 (C) replicates. CircRNA BSJs with read counts > 0 in at least one replicate were included.

Figure 3—figure supplement 2
Venn diagram of BSJs detected by circFL-seq, RNA-seq, and database.

BSJ, back-splicing junction.

Figure 3—figure supplement 3
CircRNA read distribution of eight samples of six cell lines.

Bar plot showing the distribution of known or novel circRNA BSJs with different read counts as the threshold for circFL-seq (A) and RNA-seq (B) data. BSJ, back-splicing junction; circRNA, circular RNA.

Figure 3—figure supplement 4
Comparison of circFL-seq and RNA-seq for length of full-length circRNA of six cell lines.

For RNA-seq, full-length circRNAs were reconstructed by CIRI-full. circRNA, circular RNA.

Figure 3—figure supplement 5
Comparison of circFL-seq and isoCirc for full-length circRNA detection in the HEK293 cell line.

(A) Stacked bar plot showing the number of sequenced bases. circFL-seq includes fail and clean bases. Fail bases are from low-quality reads with qscore<7 and trimmed adapters. (B) Bar plot showing the number of full-length circRNA reads. (C) Bar plot showing full-length circRNA reads per 10⁹ raw sequenced bases. (D) Bar plot showing the number of full-length circRNA isoforms. (E) Cumulative distribution of read counts of BSJs. (F–L) Stacked bar plot showing the distribution of known or novel circRNA BSJs for different read counts. (M) Plot showing the cumulative number of top expressed circRNAs of circFL-seq, isoCirc, and RNA-seq detected in the database. (N, O) Stacked bar plots showing the distribution of read counts of common circRNA BSJs detected in circFL-seq (N) and isoCirc (O). ‘Same’ and ‘different’ represent BSJs with or without the same isoforms between circFL-seq and isoCirc. The isoCirc analysis in (N, O) combines circRNA results from all six HEK293 isoCirc libraries. BSJ, back-splicing junction; circRNA, circular RNA.

Figure 3—figure supplement 6
Scatter plot showing the correlation of circRNA BSJs between circFL-seq and RNA-seq samples of six cell lines.

BSJ, back-splicing junction; circRNA, circular RNA.

Figure 3—figure supplement 7
Evaluation of circRNA quantification between circFL-seq and RT-qPCR.

The relative expression of target BSJs and isoforms quantified by RT-qPCR was determined with samples without RNase R treatment. GAPDH from total RNA without RNase R treatment was used as a reference. (A, B) Scatter plot showing the correlation of expression levels of 16 circRNA BSJs for HeLa (A) and SKOV3 (B) cells between circFL-seq and RT-qPCR. (C) Scatter plot showing the correlation of fold change (log base 2) of the 16 BSJs for HeLa and SKOV3 cells between circFL-seq and RT-qPCR. (D, E) Scatter plot showing the correlation of the transcript ratio of 18 circRNA isoforms from nine circRNA BSJs (each BSJ has two isoforms) for HeLa (D) and SKOV3 (E) cells between circFL-seq and RT-qPCR. (F) Scatter plot showing the correlation of the differential ratio (∆ratio) of the 18 isoforms for HeLa and SKOV3 cells between circFL-seq and RT-qPCR. BSJ, back-splicing junction; circRNA, circular RNA.

Figure 4 with 2 supplements
Detection and validation of fusion circRNA (f-circRNA) in the MCF7 cell line.

(A) Diagram of identification of f-circRNA with circFL-seq data. (B) Diagram of five high-quality f-circRNA isoforms (read counts≥5) fused by GBF1 and MACROD2. The transcript ratio represents the fractions of the isoforms. (C–E) Validation of f-circRNA junctions from GBF1/MACROD2 by agarose gel electrophoresis (C), Sanger sequencing (D), and RT-qPCR (E). (C) Agarose gel electrophoresis showing the RT-PCR products of f-circRNA junctions with RNase R-treated MCF7 and HeLa RNA and poly(A) selected MCF7 RNA as a template. (F) Agarose gel electrophoresis showing the RT-PCR products of f-circRNA junctions from PRICKLE2-AS1/PTPRT-AS1. (G) Information on five f-circRNA junctions detected by circFL-seq, RNA-seq, and RT-qPCR.

Figure 4—figure supplement 1
Sanger validation of sequences of f-circRNA from GBF1/MACROD2.

The forward and reverse primers are highlighted. f-circRNA, fusion circRNA.

Figure 4—figure supplement 2
Sanger validation of the sequence of f-circRNA from PRICKLE2-AS1/PTPRT-AS1.

The forward and reverse primers are highlighted. f-circRNA, fusion circRNA.

Author response image 1
Alignment schematic of nanopore reads of isoCirc-only IR events.

Tables

Key resources table
Reagent type (species) or resource | Designation | Source or reference | Identifiers | Additional information
Cell line (Homo sapiens) | HeLa | Jiadong Wang Laboratory | RRID:CVCL_0030 |
Cell line (H. sapiens) | SKOV3 | Jiadong Wang Laboratory | RRID:CVCL_0532 |
Cell line (H. sapiens) | MCF7 | Jiadong Wang Laboratory | RRID:CVCL_0031 |
Cell line (H. sapiens) | HEK293T | Jiadong Wang Laboratory | RRID:CVCL_0063 |
Cell line (H. sapiens) | SH-SY5Y | Jian Chen Laboratory | RRID:CVCL_0019 |
Cell line (H. sapiens) | VCaP | iCell Bioscience | RRID:CVCL_2235 |
Cell line (H. sapiens) | HEK293 | iCell Bioscience | RRID:CVCL_0045 |
Commercial assay or kit | Total RNA of human brain | Clontech | Cat. #: 636530 |
Commercial assay or kit | Total RNA of human testis | Clontech | Cat. #: 636533 |
Author response table 1
Summary of confounding factors among three methods.
Details | circFL-seq | CIRI-long | isoCirc
Input of total RNA (μg) | 2 | 1 | 20
Species | Human | Mouse | Human
Samples | 7 cell lines, brain and testis | Brain | 1 cell line, 12 tissues
Platform | ONT PromethION, MinION | ONT MinION | ONT MinION
Libraries per sample | One/two | Multiple | Multiple
Libraries per flow cell (sequencing depth) | One/multiple | Multiple | One
Author response table 2
Comparison of isoCirc and circFL-seq for BSJ detection in HEK293 cell line.
Total circRNA BSJs
Read counts per BSJ | 1 | 2 | 3 | 4 | >4 | All
isoCirc HEK293 SRR10612050 | 32,204 | 3,572 | 1,237 | 620 | 1,777 | 39,410
isoCirc HEK293 SRR10612051 | 34,493 | 3,916 | 1,361 | 687 | 1,915 | 42,372
isoCirc HEK293 SRR10612052 | 44,586 | 5,270 | 1,707 | 860 | 2,635 | 55,058
isoCirc HEK293 SRR10612053 | 39,484 | 5,274 | 1,871 | 1,022 | 2,970 | 50,621
isoCirc HEK293 SRR10612054 | 40,928 | 5,259 | 1,897 | 1,071 | 3,009 | 52,164
isoCirc HEK293 SRR10612055 | 30,647 | 3,779 | 1,387 | 710 | 2,024 | 38,547
isoCirc HEK293 all | 158,875 | 23,302 | 8,821 | 5,133 | 26,782 | 222,913
circFL-seq HEK293 | 13,906 | 4,889 | 2,830 | 1,525 | 4,719 | 27,869

Known circRNA BSJs annotated in database
Read counts per BSJ | 1 | 2 | 3 | 4 | >4 | All
isoCirc HEK293 SRR10612050 | 10,458 | 2,528 | 1,080 | 571 | 1,620 | 16,257
isoCirc HEK293 SRR10612051 | 10,889 | 2,663 | 1,194 | 615 | 1,751 | 17,112
isoCirc HEK293 SRR10612052 | 12,727 | 3,447 | 1,451 | 773 | 2,396 | 20,794
isoCirc HEK293 SRR10612053 | 14,828 | 3,893 | 1,665 | 944 | 2,711 | 24,041
isoCirc HEK293 SRR10612054 | 15,078 | 3,834 | 1,678 | 971 | 2,750 | 24,311
isoCirc HEK293 SRR10612055 | 12,534 | 2,969 | 1,264 | 660 | 1,860 | 19,287
isoCirc HEK293 all | 28,917 | 12,088 | 6,821 | 4,411 | 25,301 | 77,538
circFL-seq HEK293 | 8,836 | 3,821 | 2,377 | 1,365 | 4,589 | 20,988

% known circRNA BSJs
Read counts per BSJ | 1 | 2 | 3 | 4 | >4 | All
isoCirc HEK293 SRR10612050 | 32.5 | 70.8 | 87.3 | 92.1 | 91.2 | 41.3
isoCirc HEK293 SRR10612051 | 31.6 | 68.0 | 87.7 | 89.5 | 91.4 | 40.4
isoCirc HEK293 SRR10612052 | 28.5 | 65.4 | 85.0 | 89.9 | 90.9 | 37.8
isoCirc HEK293 SRR10612053 | 37.6 | 73.8 | 89.0 | 92.4 | 91.3 | 47.5
isoCirc HEK293 SRR10612054 | 36.8 | 72.9 | 88.5 | 90.7 | 91.4 | 46.6
isoCirc HEK293 SRR10612055 | 40.9 | 78.6 | 91.1 | 93.0 | 91.9 | 50.0
isoCirc HEK293 all | 18.2 | 51.9 | 77.3 | 85.9 | 94.5 | 34.8
circFL-seq HEK293 | 63.5 | 78.2 | 84.0 | 89.5 | 97.2 | 75.3
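
The percentage block of Author response table 2 follows from the two count blocks: each entry is the number of known BSJs divided by the total number of BSJs in the same read-count bin. The short Python sketch below recomputes two rows from the counts given in the table; the function and variable names are illustrative only.

```python
def percent_known(known, total):
    """Percentage of BSJs annotated in the database, per read-count bin."""
    return [round(100 * k / t, 1) for k, t in zip(known, total)]


# Bins: read counts of 1, 2, 3, 4, >4, and all BSJs (values copied from the table).
circfl_total = [13906, 4889, 2830, 1525, 4719, 27869]
circfl_known = [8836, 3821, 2377, 1365, 4589, 20988]
isocirc_all_total = [158875, 23302, 8821, 5133, 26782, 222913]
isocirc_all_known = [28917, 12088, 6821, 4411, 25301, 77538]

print(percent_known(circfl_known, circfl_total))
print(percent_known(isocirc_all_known, isocirc_all_total))
```

Both recomputed rows agree with the percentages reported in the table to one decimal place.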

Additional files

Supplementary file 1

Data summary of circFL-seq library.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp1-v2.docx
Supplementary file 2

Summary of alternative splicing events of circRNAs detected by circFL-seq.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp2-v2.docx
Supplementary file 3

Data summary of RNA-seq library.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp3-v2.docx
Supplementary file 4

Comparison of isoCirc and circFL-seq for circRNA detection in the HEK293 cell line.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp4-v2.docx
Supplementary file 5

Computational analysis of circFL-seq and CIRI-long.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp5-v2.docx
Supplementary file 6

Comparisons between circFL-seq, CIRI-long, and isoCirc.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp6-v2.docx
Supplementary file 7

Sequences of hybrid probes for rRNA degradation.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp7-v2.xlsx
Supplementary file 8

Summary of performance of strand classifier.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp8-v2.docx
Supplementary file 9

Primers to validate rolling circles of circRNAs.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp9-v2.xlsx
Supplementary file 10

Primers to validate alternative splicing of circRNAs.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp10-v2.xlsx
Supplementary file 11

Primers to validate the expression levels of circRNA BSJs by RT-qPCR.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp11-v2.xlsx
Supplementary file 12

Primers to validate the expression levels of circRNA isoforms by RT-qPCR.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp12-v2.xlsx
Supplementary file 13

Primers to validate full-length sequence of f-circRNA.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp13-v2.xlsx
Supplementary file 14

Primers to validate the expression levels of f-circRNA junctions by RT-qPCR.

https://cdn.elifesciences.org/articles/69457/elife-69457-supp14-v2.xlsx
Transparent reporting form
https://cdn.elifesciences.org/articles/69457/elife-69457-transrepform1-v2.docx

Cite this article

Zelin Liu, Changyu Tao, Shiwei Li, Minghao Du, Yongtai Bai, Xueyan Hu, Yu Li, Jian Chen, Ence Yang (2021)
circFL-seq reveals full-length circular RNAs with rolling circular reverse transcription and nanopore sequencing
eLife 10:e69457.
https://doi.org/10.7554/eLife.69457